polynomial-time algorithms for prime factorization

26
POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION AND DISCRETE LOGARITHMS ON A QUANTUM COMPUTER * PETER W. SHOR SIAM J. COMPUT. c 1997 Society for Industrial and Applied Mathematics Vol. 26, No. 5, pp. 1484–1509, October 1997 009 Abstract. A digital computer is generally believed to be an efficient universal computing device; that is, it is believed able to simulate any physical computing device with an increase in computation time by at most a polynomial factor. This may not be true when quantum mechanics is taken into consideration. This paper considers factoring integers and finding discrete logarithms, two problems which are generally thought to be hard on a classical computer and which have been used as the basis of several proposed cryptosystems. Efficient randomized algorithms are given for these two problems on a hypothetical quantum computer. These algorithms take a number of steps polynomial in the input size, e.g., the number of digits of the integer to be factored. Key words. algorithmic number theory, prime factorization, discrete logarithms, Church’s thesis, quantum computers, foundations of quantum mechanics, spin systems, Fourier transforms AMS subject classifications. 81P10, 11Y05, 68Q10, 03D10 PII. S0097539795293172 1. Introduction. One of the first results in the mathematics of computation, which underlies the subsequent development of much of theoretical computer science, was the distinction between computable and noncomputable functions shown in pa- pers of Church [1936], Post [1936], and Turing [1936]. The observation that several apparently different definitions of what it meant for a function to be computable yielded the same set of computable functions led to the proposal of Church’s thesis, which says that all computing devices can be simulated by a Turing machine. This thesis greatly simplifies the study of computation, since it reduces the potential field of study from any of an infinite number of potential computing devices to Turing ma- chines. Church’s thesis is not a mathematical theorem; to make it one would require a precise mathematical description of a computing device. Such a description, how- ever, would leave open the possibility of some practical computing device which did not satisfy this precise mathematical description and thus would make the resulting theorem weaker than Church’s original thesis. With the development of practical computers, it became apparent that the dis- tinction between computable and noncomputable functions was much too coarse; com- puter scientists are now interested in the exact efficiency with which specific functions can be computed. This exact efficiency, on the other hand, was found to be too precise a quantity to work with easily. The generally accepted compromise between coarse- ness and precision distinguishes efficiently from inefficiently computable functions by whether the length of the computation scales polynomially or superpolynomially with the input size. The class of problems which can be solved by algorithms having a number of steps polynomial in the input size is known as P. For this classification to make sense, it must be machine independent. That is, the question of whether a function is computable in polynomial time must be independent * Received by the editors October 16, 1995; accepted for publication (in revised form) December 2, 1996. This paper is an expanded version of a paper that appeared in Proc. 35th Annual Symposium on Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, CA, 1994, pp. 124–134. http://www.siam.org/journals/sicomp/26-5/29317.html AT&T Labs–Research, Room C237, 180 Park Avenue, Florham Park, NJ 07932 (shor@ research.att.com). 1484

Upload: others

Post on 22-Jul-2022

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATIONAND DISCRETE LOGARITHMS ON A QUANTUM COMPUTER∗

PETER W. SHOR†

SIAM J. COMPUT. c© 1997 Society for Industrial and Applied MathematicsVol. 26, No. 5, pp. 1484–1509, October 1997 009

Abstract. A digital computer is generally believed to be an efficient universal computing device;that is, it is believed able to simulate any physical computing device with an increase in computationtime by at most a polynomial factor. This may not be true when quantum mechanics is taken intoconsideration. This paper considers factoring integers and finding discrete logarithms, two problemswhich are generally thought to be hard on a classical computer and which have been used as the basisof several proposed cryptosystems. Efficient randomized algorithms are given for these two problemson a hypothetical quantum computer. These algorithms take a number of steps polynomial in theinput size, e.g., the number of digits of the integer to be factored.

Key words. algorithmic number theory, prime factorization, discrete logarithms, Church’sthesis, quantum computers, foundations of quantum mechanics, spin systems, Fourier transforms

AMS subject classifications. 81P10, 11Y05, 68Q10, 03D10

PII. S0097539795293172

1. Introduction. One of the first results in the mathematics of computation,which underlies the subsequent development of much of theoretical computer science,was the distinction between computable and noncomputable functions shown in pa-pers of Church [1936], Post [1936], and Turing [1936]. The observation that severalapparently different definitions of what it meant for a function to be computableyielded the same set of computable functions led to the proposal of Church’s thesis,which says that all computing devices can be simulated by a Turing machine. Thisthesis greatly simplifies the study of computation, since it reduces the potential fieldof study from any of an infinite number of potential computing devices to Turing ma-chines. Church’s thesis is not a mathematical theorem; to make it one would requirea precise mathematical description of a computing device. Such a description, how-ever, would leave open the possibility of some practical computing device which didnot satisfy this precise mathematical description and thus would make the resultingtheorem weaker than Church’s original thesis.

With the development of practical computers, it became apparent that the dis-tinction between computable and noncomputable functions was much too coarse; com-puter scientists are now interested in the exact efficiency with which specific functionscan be computed. This exact efficiency, on the other hand, was found to be too precisea quantity to work with easily. The generally accepted compromise between coarse-ness and precision distinguishes efficiently from inefficiently computable functions bywhether the length of the computation scales polynomially or superpolynomially withthe input size. The class of problems which can be solved by algorithms having anumber of steps polynomial in the input size is known as P.

For this classification to make sense, it must be machine independent. That is, thequestion of whether a function is computable in polynomial time must be independent

∗Received by the editors October 16, 1995; accepted for publication (in revised form) December 2,1996. This paper is an expanded version of a paper that appeared in Proc. 35th Annual Symposiumon Foundations of Computer Science, IEEE Computer Society Press, Los Alamitos, CA, 1994,pp. 124–134.

http://www.siam.org/journals/sicomp/26-5/29317.html†AT&T Labs–Research, Room C237, 180 Park Avenue, Florham Park, NJ 07932 (shor@

research.att.com).

1484

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 2: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1485

of the type of computing device used. This corresponds to the following quantitativeversion of Church’s thesis, which has been called the “strong Church’s thesis” byVergis, Steiglitz, and Dickinson [1986] and which makes up half of the “invariancethesis” of van Emde Boas [1990].

Thesis 1.1 (quantitative Church’s thesis). Any physical computing device can besimulated by a Turing machine in a number of steps polynomial in the resources usedby the computing device.

Readers who are not comfortable with Turing machines may think instead ofdigital computers having an amount of memory that grows linearly with the length ofthe computation, as these two classes of computing machines can efficiently simulateeach other. In statements of this thesis, the Turing machine is sometimes augmentedwith a random number generator, as it has not yet been determined whether there arepseudorandom number generators which can efficiently simulate truly random numbergenerators for all purposes.

There are two escape clauses in the above thesis. One of these is the word “phys-ical.” Researchers have produced machine models that violate the above quantitativeChurch’s thesis, but most of these have been ruled out by some reason for why theyare not physical, that is, why they could not be built and made to work. The otherescape clause in the above thesis is the word “resources,” the meaning of which isnot completely specified above. There are generally two resources which limit theability of digital computers to solve large problems: time (computational steps) andspace (memory). There are more resources pertinent to analog computation; someproposed analog machines that seem able to solve NP-complete problems in poly-nomial time have required exponentially precise parts or an exponential amount ofenergy. (See Vergis, Steiglitz, and Dickinson [1986] and Steiglitz [1988]; this issue isalso implicit in the papers of Canny and Reif [1987] and Choi, Sellen, and Yap [1995]on three-dimensional shortest paths.)

For quantum computation, in addition to space and time there is also a thirdpotentially important resource: precision. For a quantum computer to work, at leastin any currently envisioned implementation, it must be able to make changes in thequantum states of objects (e.g., atoms, photons, or nuclear spins). These changescan clearly not be perfectly accurate but must contain some small amount of inherentimprecision. If this imprecision is constant (i.e., it does not depend on the size of theinput), then it is not known how to compute any functions in polynomial time on aquantum computer that cannot also be computed in polynomial time on a classicalcomputer with a random number generator1. However, if we let the precision growpolynomially in the input size (so the number of bits of precision grows logarithmicallyin the input size), we appear to obtain a more powerful type of computer. Allowingthe same polynomial growth in precision does not appear to confer extra computingpower to classical mechanics, although allowing exponential growth in precision may[Hartmanis and Simon 1974; Vergis, Steiglitz, and Dickinson 1986].

As far as we know, what precision is possible in quantum state manipulation isdictated not by fundamental physical laws but by the properties of the materials fromwhich and the architecture with which a quantum computer is built. It is currently notclear which architectures, if any, will give high precision, and what this precision willbe. If the precision of a quantum computer is large enough to make it more powerful

1Note added in proof. This is no longer true. See Aharonov and Ben-Or [Proc. 29th AnnualACM Symposium on Theory of Computing, ACM, New York, 1997, pp. 176–188]; Gottesman, Evslin,Kakade, and Preskill [manuscript, 1997]; Kitaev [manuscript, 1997], Knill, Laflamme, and Zurek[LANL e-print quant-ph/9702058, Los Alamos National Laboratories, Los Alamos, NM, 1997].

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 3: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1486 PETER W. SHOR

than a classical computer, then in order to understand its potential it is importantto think of precision as a resource that can vary. Treating the precision as a largeconstant (even though it is almost certain to be constant for any given machine) wouldbe comparable to treating a classical digital computer as a finite automaton—sinceany given computer has a fixed amount of memory, this view is technically correct;however, it is not particularly useful.

Because of the remarkable effectiveness of our mathematical models of computa-tion, computer scientists have tended to forget that computation is dependent on thelaws of physics. This can be seen in the statement of the quantitative Church’s thesisin van Emde Boas [1990], where the word “physical” in the above phrasing is replacedby the word “reasonable.” It is difficult to imagine any definition of “reasonable” inthis context which does not mean “physically realizable,” i.e., that this computingmachine could actually be built and would work.

Computer scientists have become convinced of the truth of the quantitativeChurch’s thesis through the failure of all proposed counterexamples. Most of theseproposed counterexamples have been based on the laws of classical mechanics; how-ever, the universe is in reality quantum mechanical. Quantum mechanical objectsoften behave quite differently from how our intuition, based on classical mechanics,tells us they should. It thus seems plausible that the natural computing power of clas-sical mechanics corresponds to that of Turing machines,2 while the natural computingpower of quantum mechanics might be greater.

The first person to look at the interaction between computation and quantummechanics appears to have been Benioff [1980, 1982a, 1982b]. Although he did notask whether quantum mechanics conferred extra power to computation, he showedthat reversible unitary evolution was sufficient to realize the computational powerof a Turing machine, thus showing that quantum mechanics is computationally atleast as powerful as classical computers. This work was fundamental in making laterinvestigation of quantum computers possible.

Feynman [1982, 1986] seems to have been the first to suggest that quantum me-chanics might be computationally more powerful than Turing machines. He gavearguments as to why quantum mechanics is intrinsically computationally expensiveto simulate on a classical computer. He also raised the possibility of using a computerbased on quantum mechanical principles to avoid this problem, thus implicitly askingthe converse question, “By using quantum mechanics in a computer can you computemore efficiently than on a classical computer?” The first to ask this question explicitlywas Deutsch [1985, 1989]. In order to study this question, he defined both quantumTuring machines and quantum circuits and investigated some of their properties.

The question of whether using quantum mechanics in a computer allows oneto obtain more computational power was more recently addressed by Deutsch andJozsa [1992] and Berthiaume and Brassard [1992, 1994]. These papers showed thatthere are problems which quantum computers can quickly solve exactly, but thatclassical computers can only solve quickly with high probability and the aid of arandom number generator. However, these papers did not show how to solve anyproblem in quantum polynomial time that was not already known to be solvablein polynomial time with the aid of a random number generator, allowing a small

2See Vergis, Steiglitz, and Dickinson [1986], Steiglitz [1988], and Rubel [1989]. I believe thatthis question has not yet been settled and is worthy of further investigation. In particular, tur-bulence seems a good candidate for a counterexample to the quantitative Church’s thesis becausethe nontrivial dynamics on many length scales appear to make it difficult to simulate on a classicalcomputer.

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 4: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1487

probability of error; this is the characterization of the complexity class BPP (boundederror probability probabilistic polynomial time), which is widely viewed as the classof efficiently solvable problems.

Further work on this problem was stimulated by Bernstein and Vazirani [1993].One of the results contained in their paper was an oracle problem (that is, a probleminvolving a “black box” subroutine that the computer is allowed to perform but forwhich no code is accessible) which can be done in polynomial time on a quantumTuring machine but which requires superpolynomial time on a classical computer.This result was improved by Simon [1994], who gave a much simpler construction ofan oracle problem which takes polynomial time on a quantum computer but requiresexponential time on a classical computer. Indeed, while Bernstein and Vaziarni’sproblem appears contrived, Simon’s problem looks quite natural. Simon’s algorithminspired the work presented in this paper.

Two number theory problems which have been studied extensively but for whichno polynomial-time algorithms have yet been discovered are finding discrete loga-rithms and factoring integers [Pomerance 1987, Gordon 1993, Lenstra and Lenstra1993, Adleman and McCurley 1994]. These problems are so widely believed to behard that several cryptosystems based on their difficulty have been proposed, includ-ing the widely used RSA public key cryptosystem developed by Rivest, Shamir, andAdleman [1978]. We show that these problems can be solved in polynomial time ona quantum computer with a small probability of error.

Currently, nobody knows how to build a quantum computer, although it seemsas though it might be possible within the laws of quantum mechanics. Some sugges-tions have been made as to possible designs for such computers [Teich, Obermayer,and Mahler 1988; Lloyd 1993, 1994; Cirac and Zoller 1995; DiVincenzo 1995; Sleatorand Weinfurter 1995; Barenco et al. 1995b; Chuang and Yamomoto 1995], but therewill be substantial difficulty in building any of these [Landauer 1995, 1997; Unruh1995; Chuang et al. 1995; Palma, Suominen, and Ekert 1996]. The most difficultobstacles appear to involve the decoherence of quantum superpositions through theinteraction of the computer with the environment, and the implementation of quan-tum state transformations with enough precision to give accurate results after manycomputation steps. Both of these obstacles become more difficult as the size of thecomputer grows, so it may turn out to be possible to build small quantum computers,while scaling up to machines large enough to do interesting computations may presentfundamental difficulties.

Even if no useful quantum computer is ever built, this research does illuminatethe problem of simulating quantum mechanics on a classical computer. Any methodof doing this for an arbitrary Hamiltonian would necessarily be able to simulate aquantum computer. Thus, any general method for simulating quantum mechanicswith at most a polynomial slowdown would lead to a polynomial-time algorithm forfactoring.

The rest of this paper is organized as follows. In section 2, we introduce the modelof quantum computation, the quantum gate array, that we use in the rest of the pa-per. In sections 3 and 4, we explain two subroutines that are used in our algorithms:reversible modular exponentiation in section 3 and quantum Fourier transforms in sec-tion 4. In section 5, we give our algorithm for prime factorization, and in section 6,we give our algorithm for extracting discrete logarithms. In section 7, we give a briefdiscussion of the practicality of quantum computation and suggest possible directionsfor further work.

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 5: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1488 PETER W. SHOR

2. Quantum computation. In this section we give a brief introduction to quan-tum computation, emphasizing the properties that we will use. We will describe onlyquantum gate arrays, or quantum acyclic circuits, which are analogous to acyclic cir-cuits in classical computer science. This model was originally studied by Yao [1993]and is closely related to the quantum computational networks discussed by Deutsch[1989]. For other models of quantum computers, see references on quantum Turingmachines [Deutsch 1985; Bernstein and Vazirani 1993; Yao 1993] and quantum cel-lular automata [Feynman 1986; Margolus 1986, 1990; Lloyd 1993; Biafore 1994]. Ifthey are allowed a small probability of error, quantum Turing machines and quantumgate arrays can compute the same functions in polynomial time [Yao 1993]. This mayalso be true for the various models of quantum cellular automata, but it has not yetbeen proved. This gives evidence that the class of functions computable in quantumpolynomial time with a small probability of error is robust in that it does not dependon the exact architecture of a quantum computer. By analogy with the classical classBPP, this class is called BQP (bounded error probability quantum polynomial time).

Consider a system with n components, each of which can have two states.Whereas in classical physics, a complete description of the state of this system re-quires only n bits, in quantum physics, a complete description of the state of thissystem requires 2n − 1 complex numbers. To be more precise, the state of the quan-tum system is a point in a 2n-dimensional vector space. For each of the 2n possibleclassical positions of the components, there is a basis state of this vector space whichwe represent, for example, by |011 · · · 0〉 meaning that the first bit is 0, the second bitis 1, and so on. Here, the ket notation |x〉 means that x is a (pure) quantum state.(Mixed states will not be discussed in this paper, and thus we do not define them;see a quantum theory book such as Peres [1993] for their definition.) The Hilbertspace associated with this quantum system is the complex vector space with these 2n

states as basis vectors, and the state of the system at any time is represented by aunit-length vector in this Hilbert space. As multiplying this state vector by a unit-length complex phase does not change any behavior of the state, we need only 2n− 1complex numbers to completely describe the state. We represent this superpositionof states by

2n−1∑i=0

ai|Si〉,(2.1)

where the amplitudes ai are complex numbers such that∑

i |ai|2 = 1 and each |Si〉is a basis vector of the Hilbert space. If the machine is measured (with respect tothis basis) at any particular step, the probability of seeing basis state |Si〉 is |ai|2;however, measuring the state of the machine projects this state to the observed basisvector |Si〉. Thus, looking at the machine during the computation will invalidate therest of the computation. General quantum mechanical measurements, i.e., POVMs(positive operator valued measurement, see [Peres 1993]), can be considerably morecomplicated than the case of projection onto the canonical basis to which we restrictourselves in this paper. This does not greatly restrict our model of computation, sincemeasurements in other reasonable bases, as well as other local measurements, can besimulated by first using quantum computation to perform a change of basis and thenperforming a measurement in the canonical basis.

In order to use a physical system for computation, we must be able to change thestate of the system. The laws of quantum mechanics permit only unitary transforma-tions of state vectors. A unitary matrix is one whose conjugate transpose is equal to

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 6: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1489

its inverse, and requiring state transformations to be represented by unitary matricesensures that summing the probability over all possible outcomes yields 1. The defi-nition of quantum circuits (and quantum Turing machines) allows only local unitarytransformations—that is, unitary transformations on a fixed number of bits. This isphysically justified because, given a general unitary transformation on n bits, it isnot clear how one could efficiently implement it physically, whereas two-bit transfor-mations can at least in theory be implemented by relatively simple physical systems[Cirac and Zoller 1995; DiVincenzo 1995; Sleator and Weinfurter 1995; Chuang andYamomoto 1995]. While general n-bit transformations can always be built out oftwo-bit transformations [DiVincenzo 1995; Sleator and Weinfurter 1995; Lloyd 1995;Deutsch, Barenco, and Ekert 1995], the number required will often be exponentialin n [Barenco et al. 1995a]. Thus, the set of two-bit transformations form a set ofbuilding blocks for quantum circuits in a manner analogous to the way a universal setof classical gates (such as the AND, OR, and NOT gates) form a set of building blocksfor classical circuits. In fact, for a universal set of quantum gates, it is sufficient totake all one-bit gates and a single type of two-bit gate, the controlled NOT gate (alsocalled the XOR or parity gate), which negates the second bit if and only if the firstbit is 1.

Perhaps an example will be informative at this point. A quantum gate can beexpressed as a truth table: for each input basis vector we need to give the output ofthe gate. One such gate is

|00〉 → |00〉,|01〉 → |01〉,(2.2)

|10〉 → 1√2(|10〉+ |11〉),

|11〉 → 1√2(|10〉 − |11〉).

Not all truth tables correspond to physically feasible quantum gates, as many truthtables will not give rise to unitary transformations.

The same gate can also be represented as a matrix. The rows correspond to inputbasis vectors. The columns correspond to output basis vectors. The (i, j) entry gives,when the ith basis vector is input to the gate, the coefficient of the jth basis vector inthe corresponding output of the gate. The truth table above would then correspondto the following matrix:

|00〉 |01〉 |10〉 |11〉|00〉 1 0 0 0

|01〉 0 1 0 0

|10〉 0 0 1√2

1√2

|11〉 0 0 1√2

− 1√2

.

(2.3)

A quantum gate is feasible if and only if the corresponding matrix is unitary, i.e., itsinverse is its conjugate transpose.

Suppose that our machine is in the superposition of states

1√2|10〉 − 1√

2|11〉(2.4)

and we apply the unitary transformation represented by (2.2) and (2.3) to this state.The resulting output will be the result of multiplying the vector (2.4) by the ma-trix (2.3). The machine will thus go to the superposition of states

12 (|10〉+ |11〉)− 1

2 (|10〉 − |11〉) = |11〉.(2.5)

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 7: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1490 PETER W. SHOR

This example shows the potential effects of interference on quantum computation.Had we started with either the state |10〉 or the state |11〉, there would have been achance of observing the state |10〉 after the application of the gate (2.3). However,when we start with a superposition of these two states, the probability amplitudes forthe state |10〉 cancel, and we have no possibility of observing |10〉 after the applicationof the gate. Notice that the output of the gate would have been |10〉 instead of |11〉had we started with the superposition of states

1√2|10〉+ 1√

2|11〉,(2.6)

which has the same probabilities of being in any particular configuration if it is ob-served as does the superposition (2.4).

If we apply a gate to only two bits of a longer vector (now our circuit must havemore than two wires), for each basis vector we apply the transformation given by thegate’s truth table to the two bits on which the gate is operating, and leave the otherbits alone. This corresponds to multiplying the whole state by the tensor product ofthe gate matrix on those two bits with the identity matrix on the remaining bits. Forexample, applying the transformation represented by (2.2) and (2.3) to the first twobits of the basis vector |110〉 yields the vector 1√

2(|100〉 − |110〉).

A quantum gate array is a set of quantum gates with logical “wires” connectingtheir inputs and outputs. The input to the gate array, possibly along with extrawork bits that are initially set to 0, is fed through a sequence of quantum gates. Thevalues of the bits are observed after the last quantum gate, and these values are theoutput. This model is analogous to classical acyclic circuits in theoretical computerscience, and was previously studied by Yao [1993]. As in the classical case, in orderto compare quantum gate arrays with quantum Turing machines, we need to makethe gate arrays a uniform complexity class. In other words, because we use a differentgate array for each size of input, we need to keep the designer of the gate arrays fromhiding noncomputable (or hard to compute) information in the arrangement of thegates. To make quantum gate arrays uniform, we must add two requirements to thedefinition of gate arrays. The first is the standard uniformity requirement that thedesign of the gate array be produced by a polynomial-time (classical) computation.The second uniformity requirement should be a standard part of the definition ofanalog complexity classes; however, since analog complexity classes have not beenas widely studied, this requirement is not well known. The requirement is that theentries in the unitary matrices describing the gates must be computable numbers.Specifically, the first logn bits of each entry should be classically computable in timepolynomial in n [Solovay 1995]. This keeps noncomputable (or hard to compute)information from being hidden in the bits of the amplitudes of the quantum gates.

3. Reversible logic and modular exponentiation. The definition of quan-tum gate arrays gives rise to completely reversible computation. That is, knowing thequantum state on the wires leading out of a gate tells uniquely what the quantumstate must have been on the wires leading into that gate. This is a reflection of thefact that, despite the macroscopic arrow of time, the laws of physics appear to becompletely reversible. This would seem to imply that anything built with the lawsof physics must be completely reversible; however, classical computers get aroundthis by dissipating energy and thus making their computations thermodynamicallyirreversible. This appears impossible to do for quantum computers because superpo-sitions of quantum states need to be maintained throughout the computation. Thus,quantum computers necessarily have to use reversible computation. This imposes

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 8: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1491

Table 3.1

Truth tables for Toffoli and Fredkin gates.

Toffoli gate

INPUT OUTPUT0 0 0 0 0 00 0 1 0 0 10 1 0 0 1 00 1 1 0 1 11 0 0 1 0 01 0 1 1 0 11 1 0 1 1 11 1 1 1 1 0

Fredkin gate

INPUT OUTPUT0 0 0 0 0 00 0 1 0 1 00 1 0 0 0 10 1 1 0 1 11 0 0 1 0 01 0 1 1 0 11 1 0 1 1 01 1 1 1 1 1

extra costs when doing classical computations on a quantum computer, which can benecessary in subroutines of quantum computations.

Because of the reversibility of quantum computation, a deterministic computationis performable on a quantum computer only if it is reversible. Luckily, it has alreadybeen shown that any deterministic computation can be made reversible [Lecerf 1963;Bennett 1973]. In fact, reversible classical gate arrays (or reversible acyclic circuits)have been studied. Much like the result that any classical computation can be doneusing NAND gates, there are also universal gates for reversible computation. Twoof these are Toffoli gates [Toffoli 1980] and Fredkin gates [Fredkin and Toffoli 1982];these are illustrated in Table 3.1.

The Toffoli gate is just a doubly controlled NOT, i.e., the last bit is negated ifand only if the first two bits are 1. In a Toffoli gate, if the third input bit is set to 1,then the third output bit is the NAND of the first two input bits. Since NAND is auniversal gate for classical gate arrays, this shows that the Toffoli gate is universal.In a Fredkin gate, the last two bits are swapped if the first bit is 0, and left untouchedif the first bit is 1. For a Fredkin gate, if the third input bit is set to 0, the secondoutput bit is the AND of the first two input bits; and if the last two input bits areset to 0 and 1 respectively, the second output bit is the NOT of the first input bit.Thus, both AND and NOT gates are realizable using Fredkin gates, showing that theFredkin gate is universal.

From results on reversible computation [Lecerf 1963, Bennett 1973], we can effi-ciently compute any polynomial time function F (x) as long as we keep the input xin the computer. We do this by adapting the method for computing the function Fnonreversibly. These results can easily be extended to work for gate arrays [Toffoli1980; Fredkin and Toffoli 1982]. When AND, OR, or NOT gates are changed to Fred-kin or Toffoli gates, one obtains both additional input bits, which must be preset tospecified values, and additional output bits, which contain the information needed toreverse the computation. While the additional input bits do not present difficultiesin designing quantum computers, the additional output bits do, because unless theyare all reset to 0, they will affect the interference patterns in quantum computation.Bennett’s method for resetting these bits to 0 is shown in the top half of Table 3.2.A nonreversible gate array may thus be turned into a reversible gate array as fol-lows. First, duplicate the input bits as many times as necessary (since each inputbit could be used more than once by the gate array). Next, keeping one copy of theinput around, use Toffoli and Fredkin gates to simulate nonreversible gates, puttingthe extra output bits into the RECORD register. These extra output bits preserveenough of a record of the operations to enable the computation of the gate array to be

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 9: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1492 PETER W. SHOR

Table 3.2

Bennett’s method for making a computation reversible.

INPUT - - - - - - - - - - - - - - - - - -INPUT OUTPUT RECORD(F ) - - - - - -INPUT OUTPUT RECORD(F ) OUTPUTINPUT - - - - - - - - - - - - OUTPUTINPUT INPUT RECORD(F−1) OUTPUT- - - - - - INPUT RECORD(F−1) OUTPUT- - - - - - - - - - - - - - - - - - OUTPUT

reversed. Once the output F (x) has been computed, copy it into a register that hasbeen preset to zero, and then undo the computation to erase both the first OUTPUTregister and the RECORD register.

To erase x and replace it with F (x), in addition to a polynomial-time algorithmfor F , we also need a polynomial-time algorithm for computing x from F (x); i.e., weneed that F is one-to-one and that both F and F−1 are polynomial-time computable.The method for this computation is given in the whole of Table 3.2. There are twostages to this computation. The first is the same as before, taking x to (x, F (x)).For the second stage, shown in the bottom half of Table 3.2, note that if we havea method to compute F−1 nonreversibly in polynomial time, we can use the sametechnique to reversibly map F (x) to (F (x), F−1(F (x))) = (F (x), x). However, sincethis is a reversible computation, we can reverse it to go from (x, F (x)) to F (x). Puttogether, these two stages take x to F (x).

The above discussion shows that computations can be made reversible for onlya constant factor cost in time, but the above method uses as much space as it doestime. If the classical computation requires much less space than time, then making itreversible in this manner will result in a large increase in the space required. Thereare methods that do not use as much space, but use more time, to make computa-tions reversible [Bennett 1989, Levine and Sherman 1990]. While there is no generalmethod that does not cause an increase in either space or time, specific algorithms cansometimes be made reversible without paying a large penalty in either space or time;at the end of this section we will show how to do this for modular exponentiation,which is a subroutine necessary for quantum factoring.

The bottleneck in the quantum factoring algorithm—i.e., the piece of the factoringalgorithm that consumes the most time and space—is modular exponentiation. Themodular exponentiation problem is, given n, x, and r, find xr (mod n). The best

classical method for doing this is to repeatedly square of x (mod n) to get x2i (mod n)for i ≤ log2 r, and then multiply a subset of these powers (mod n) to get xr (mod n).If we are working with l-bit numbers, this requires O(l) squarings and multiplicationsof l-bit numbers (mod n). Asymptotically, the best classical result for gate arraysfor multiplication is the Schonhage–Strassen algorithm [Schonhage and Strassen 1971,Knuth 1981, Schonhage 1982]. This gives a gate array for integer multiplication thatuses O(l log l log log l) gates to multiply two l-bit numbers. Thus, asymptotically,modular exponentiation requiresO(l2 log l log log l) time. Making this reversible wouldnaıvely cost the same amount in space; however, one can reuse the space used in therepeated squaring part of the algorithm, and thus reduce the amount of space neededto essentially that required for multiplying two l-bit numbers; one simple method forreducing this space (although not the most versatile one) will be given later in thissection. Thus, modular exponentiation can be done in O(l2 log l log log l) time andO(l log l log log l) space.

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 10: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1493

While the Schonhage–Strassen algorithm is the best multiplication algorithm dis-covered to date for large l, it does not scale well for small l. For small numbers, thebest gate arrays for multiplication essentially use elementary-school longhand multi-plication in binary. This method requires O(l2) time to multiply two l-bit numbers,and thus modular exponentiation requires O(l3) time with this method. These gatearrays can be made reversible, however, using only O(l) space.

We now give the method for constructing a reversible gate array that takes onlyO(l) space and O(l3) time to compute (a, xa (mod n)) from a, where a, x, and nare l-bit numbers and x and n are relatively prime. This case, where x and n arerelatively prime, is sufficient for our factoring and discrete logarithm algorithms. Adetailed analysis of essentially this method, giving an exact number of quantum gatessufficient for factoring, was performed by Beckman et al. [1996].

The basic building block used is a gate array that takes b as input and outputsb + c (mod n). Note that here b is the gate array’s input but c and n are built intothe structure of the gate array. Since addition (mod n) is computable in O(logn)time classically, this reversible gate array can be made with only O(logn) gates andO(logn) work bits using the techniques explained earlier in this section.

The technique we use for computing xa (mod n) is essentially the same as the

classical method. First, by repeated squaring we compute x2i (mod n) for all i < l.

Then, to obtain xa (mod n) we multiply the powers x2i (mod n), where 2i appears inthe binary expansion of a. In our algorithm for factoring n, we need only to computexa (mod n), where a is in a superposition of states but x is some fixed integer. Thismakes things much easier, because we can use a reversible gate array where a is inputbut where x and n are built into the structure of the gate array. Thus, we can usethe algorithm described by the following pseudocode; here ai represents the ith bit ofa in binary, where the bits are indexed from right to left and the rightmost bit of ais a0.

power := 1

for i = 0 to l−1if ( ai == 1 ) then

power := power ∗ x 2i (mod n)endif

endfor

The variable a is left unchanged by the code and xa (mod n) is output as the variablepower . Thus, this code takes the pair of values (a, 1) to (a, xa (mod n)).

This pseudocode can easily be turned into a gate array; the only hard part of thisis the fourth line, where we multiply the variable power by x2i (mod n); to do this we

need to use a fairly complicated gate array as a subroutine. Recall that x2i (mod n)can be computed classically and then built into the structure of the gate array. Thus,to implement this line, we need a reversible gate array that takes b as input andgives bc (mod n) as output, where the structure of the gate array can depend on cand n. Of course, this step can only be reversible if gcd(c, n) = 1—i.e., if c and nhave no common factors—as otherwise two distinct values of b will be mapped to thesame value of bc (mod n). Fortunately, x and n being relatively prime in modularexponentiation implies that c and n are relatively prime in this subroutine.

We will show how to build this gate array in two stages. The first stage is directlyanalogous to exponentiation by repeated multiplication; we obtain multiplication fromrepeated addition (mod n). Pseudocode for this stage is as follows.

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 11: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1494 PETER W. SHOR

result := 0

for i = 0 to l−1if ( bi == 1 ) then

result := result + 2ic (mod n)endif

endfor

Again, 2ic (mod n) can be precomputed and built into the structure of the gate array.The above pseudocode takes b as input and gives (b, bc (mod n)) as output. To

get the desired result, we now need to erase b. Recall that gcd(c, n) = 1, so there isa c−1 (mod n) with c c−1 ≡ 1 (mod n). Multiplication by this c−1 could be used toreversibly take bc (mod n) to (bc (mod n), bcc−1 (mod n)) = (bc (mod n), b). This isjust the reverse of the operation we want, and since we are working with reversiblecomputing, we can turn this operation around to erase b. The pseudocode for thisfollows.

for i = 0 to l−1if ( result i == 1 ) then

b := b − 2ic−1 (mod n)endif

endfor

As before, result i is the ith bit of result.Note that at this stage of the computation, b should be 0. However, we did not set

b directly to zero, as this would not have been a reversible operation and thus impos-sible on a quantum computer, but instead we did a relatively complicated sequenceof operations which ended with b = 0 and which in fact depended on multiplicationbeing a group (mod n). At this point, then, we could do something somewhat sneaky:we could measure b to see if it actually is 0. If it is not, we know that there has been anerror somewhere in the quantum computation, i.e., that the results are worthless andwe should stop the computer and start over again. However, if we do find that b is 0,then we know (because we just observed it) that it is now exactly 0. This measurementthus may bring the quantum computation back on track in that any amplitude thatb had for being nonzero has been eliminated. Further, because the probability thatwe observe a state is proportional to the square of the amplitude of that state, doingthe modular exponentiation and measuring b every time that we know it should be 0may result in a higher probability of overall success than the same computation donewithout the repeated measurements of b. This is the quantum watchdog (or quantumZeno) effect [Peres 1993], and whether it is applicable in this setting depends on theerror model for our quantum circuits. The argument above does not actually showthat repeated measurement of b is indeed beneficial, because there is a cost (in time,if nothing else) of measuring b. Before this is implemented, then, it should be checkedwith analysis or experiment that the benefit of such measurements exceeds their cost.In general, I believe that partial measurements such as this one are a promising wayof trying to stabilize quantum computations.

Currently, Schonhage–Strassen is the algorithm of choice for multiplying verylarge numbers, and longhand multiplication is the algorithm of choice for small num-bers. There are also multiplication algorithms which have asymptotic efficiencies be-tween these two algorithms and which are superior for intermediate length numbers[Karatsuba and Ofman 1962; Knuth 1981; Schonhage, Grotefeld, and Vetter 1994]. Itis not clear which algorithms are best for which size numbers. While this is known to

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 12: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1495

some extent for classical computation [Schonhage, Grotefeld, and Vetter 1994], usingdata on which algorithms work better on classical computers could be misleading fortwo reasons. First, classical computers need not be reversible, and the cost of makingan algorithm reversible depends on the algorithm. Second, existing computers gener-ally have multiplication for 32- or 64-bit numbers built into their hardware, and thistends to increase the optimal changeover points. To further confuse matters, somemultiplication algorithms can take better advantage of hardwired multiplication thanothers. In order to program quantum computers most efficiently, work thus needsto be done on the best way of implementing elementary arithmetic operations onquantum computers. One tantalizing fact is that the Schonhage–Strassen fast multi-plication algorithm uses the fast Fourier transform, which is also the basis for all thefast algorithms on quantum computers discovered to date. It is thus tempting to spec-ulate that integer multiplication itself might be speeded up by a quantum algorithm;if possible, this would result in a somewhat faster asymptotic bound for factoringon a quantum computer, and indeed could even make breaking RSA on a quantumcomputer asymptotically faster than encrypting with RSA on a classical computer.

4. Quantum Fourier transforms. Since quantum computation deals with uni-tary transformations, it is helpful to know how to build certain useful unitary trans-formations. In this section we give a technique for constructing in polynomial timeon quantum computers one particular unitary transformation, which is essentially adiscrete Fourier transform. This transformation will be given as a matrix, with bothrows and columns indexed by states. These states correspond to binary representa-tions of integers on the computer; in particular, the rows and columns will be indexedbeginning with 0 unless otherwise specified.

This transformation is as follows. Consider a number a with 0 ≤ a < q for someq. We will perform the transformation that takes the state |a〉 to the state

1

q1/2

q−1∑c=0

|c〉 exp(2πiac/q).(4.1)

That is, we apply the unitary matrix whose (a, c) entry is 1q1/2

exp(2πiac/q). This

Fourier transform is at the heart of our algorithms, and we call this matrix Aq.In our factoring and discrete logarithm algorithms, we will use Aq for q of expo-

nential size (i.e., the number of bits of q grows polynomially with the length of ourinput). We must thus show how this transformation can be done in time polynomialin the number of bits of q. In this paper, we give a simple construction for Aq when qis a power of 2 that was discovered independently by Coppersmith [1994] and Deutsch[see Ekert and Jozsa 1996]. This construction is essentially the standard fast Fouriertransform (FFT) algorithm [Knuth 1981] adapted for a quantum computer; the fol-lowing description of it follows that of Ekert and Jozsa [1996]. In the earlier versionof this paper [Shor 1994], we gave a construction for Aq when q was in the specialclass of smooth numbers having only small prime power factors. In fact, Cleve [1994]has shown how to construct Aq for all smooth numbers q whose prime factors are atmost O(logn).

Take q = 2l, and let us represent an integer a in binary as |al−1al−2 . . . a0〉. Forthe quantum Fourier transform Aq, we need only to use two types of quantum gates.

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 13: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1496 PETER W. SHOR

These gates are Rj , which operates on the jth bit of the quantum computer:

Rj =

|0〉 |1〉|0〉 1√

21√2

|1〉 1√2

− 1√2

,

(4.2)

and Sj,k, which operates on the bits in positions j and k with j < k:

Sj,k =

|00〉 |01〉 |10〉 |11〉|00〉 1 0 0 0

|01〉 0 1 0 0

|10〉 0 0 1 0

|11〉 0 0 0 eiθk−j ,

(4.3)

where θk−j = π/2k−j . To perform a quantum Fourier transform, we apply the matri-ces in the order (from left to right)

Rl−1 Sl−2,l−1 Rl−2 Sl−3,l−1 Sl−3,l−2 Rl−3 . . . R1 S0,l−1 S0,l−2 . . . S0,2 S0,1 R0;(4.4)

that is, we apply the gates Rj in reverse order from Rl−1 to R0, and between Rj+1

and Rj we apply all the gates Sj,k where k > j. For example, on 3 bits, the matriceswould be applied in the order R2S1,2R1S0,2S0,1R0. To take the Fourier transform Aq

when q = 2l, we thus need to use l(l − 1)/2 quantum gates.Applying this sequence of transformations will result in a quantum state

1q1/2

∑b exp(2πiac/q)|b〉, where b is the bit-reversal of c, i.e., the binary number ob-

tained by reading the bits of c from right to left. Thus, to obtain the actual quantumFourier transform, we need either to do further computation to reverse the bits of |b〉to obtain |c〉, or to leave these bits in place and read them in reverse order; eitheralternative is easy to implement.

To show that this operation actually performs a quantum Fourier transform,consider the amplitude of going from |a〉 = |al−1 . . . a0〉 to |b〉 = |bl−1 . . . b0〉. First,the factors of 1/

√2 in the R matrices multiply to produce a factor of 1/q1/2 overall;

thus we need only worry about the exp(2πiac/q) phase factor in the expression (4.1).The matrices Sj,k do not change the values of any bits, but merely change their phases.There is thus only one way to switch the jth bit from aj to bj , and that is to usethe appropriate entry in the matrix Rj . This entry adds π to the phase if the bits ajand bj are both 1, and leaves it unchanged otherwise. Further, the matrix Sj,k addsπ/2k−j to the phase if aj and bk are both 1 and leaves it unchanged otherwise. Thus,the phase on the path from |a〉 to |b〉 is∑

0≤j<l

πajbj +∑

0≤j<k<l

π

2k−jajbk.(4.5)

This expression can be rewritten as∑0≤j≤k<l

π

2k−jajbk.(4.6)

Since c is the bit-reversal of b, this expression can be further rewritten as∑0≤j≤k<l

π

2k−jajcl−1−k.(4.7)

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 14: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1497

Making the substitution l − k − 1 for k in this sum, we obtain

∑0≤j+k<l

2π2j2k

2lajck.(4.8)

Now, since adding multiples of 2π do not affect the phase, we obtain the same phaseif we sum over all j and k less than l, obtaining

l−1∑j,k=0

2π2j2k

2lajck =

2l

l−1∑j=0

2jaj

l−1∑k=0

2kck,(4.9)

where the last equality follows from the distributive law of multiplication. Now, q = 2l

and

a =l−1∑j=0

2jaj , c =l−1∑k=0

2kck,(4.10)

so the expression (4.9) is equal to 2πac/q, which is the phase for the amplitude|a〉 → |c〉 in the transformation (4.1).

When k − j is large in the gate Sj,k in (4.3), we are multiplying by a very smallphase factor. This would be very difficult to do accurately physically, and thus it wouldbe somewhat disturbing if this were necessary for quantum computation. In fact, Cop-persmith [1994] has shown that one can use an approximate Fourier transform thatignores these tiny phase factors but which approximates the Fourier transform closelyenough that it can also be used for factoring. In fact, this technique reduces the num-ber of quantum gates needed for the (approximate) Fourier transform considerably,as it leaves out most of the gates Sj,k.

Recently, Griffiths and Niu [1996] have shown that this Fourier transform can becarried out using only one-bit gates and measurements of single bits. Both of theseoperations are potentially easier to implement in a physical system than two-bit gates.The use of two-bit gates, however, is still required during the modular exponentiationstep of the factoring and discrete logarithm algorithms.

5. Prime factorization. It has been known since before Euclid that everyinteger n is uniquely decomposable into a product of primes. For nearly as long,mathematicians have been interested in the question of how to factor a number intothis product of primes. It was only in the 1970s, however, that researchers appliedthe paradigms of theoretical computer science to number theory, and looked at theasymptotic running times of factoring algorithms [Adleman 1994]. This has resultedin a great improvement in the efficiency of factoring algorithms. Currently the bestfactoring algorithm, both asymptotically and in practice, is the number field sieve[Lenstra et al. 1990, Lenstra and Lenstra 1993], which in order to factor an integer ntakes asymptotic running time exp(c(logn)1/3(log log n)2/3) for some constant c. Sincethe input n is only logn bits in length, this algorithm is an exponential-time algo-rithm. Our quantum factoring algorithm takes asymptotically O((logn)2(log log n)(log log log n)) steps on a quantum computer, along with a polynomial (in logn)amount of post-processing time on a classical computer that is used to convert theoutput of the quantum computer to factors of n. While this post-processing could inprinciple be done on a quantum computer, there is no reason not to use a classicalcomputer for this step.

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 15: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1498 PETER W. SHOR

Instead of giving a quantum computer algorithm for factoring n directly, we givea quantum computer algorithm for finding the order r of an element x in the multi-plicative group (mod n); that is, the least integer r such that xr ≡ 1 (mod n). It isknown that using randomization, factorization can be reduced to finding the order ofan element [Miller 1976]; we now briefly give this reduction.

To find a factor of an odd number n, given a method for computing the orderr of x, choose a random x (mod n), find its order r, and compute gcd(xr/2 − 1, n).Here, gcd(a, b) is the greatest common divisor of a and b, i.e., the largest integer thatdivides both a and b. The Euclidean algorithm [Knuth 1981] can be used to computegcd(a, b) in polynomial time. Since (xr/2 − 1)(xr/2 + 1) = xr − 1 ≡ 0 (mod n), thenumbers gcd(xr/2 +1, n) and gcd(xr/2−1, n) will be two factors of n. This procedurefails only if r is odd, in which case r/2 is not integral, or if xr/2 ≡ −1 (mod n), inwhich case the procedure yields the trivial factors 1 and n. Using this criterion, itcan be shown that this procedure, when applied to a random x (mod n), yields anontrivial factor of n with probability at least 1− 1/2k−1, where k is the number ofdistinct odd prime factors of n. A brief sketch of the proof of this result follows.

Suppose that n =∏k

i=1 pαii is the prime factorization of n. Let ri be the order of

x (mod pαii ). Then r is the least common multiple of all the ri. Consider the largestpower of 2 dividing each ri. The algorithm only fails if all of these powers of 2 agree:if they are all 1, then r is odd and r/2 does not exist; if they are all equal and largerthan 1, then xr/2 ≡ −1 (mod pαii ) for every i, so xr/2 ≡ −1 (mod n). By the Chineseremainder theorem [Knuth 1981; Hardy and Wright 1979, Theorem 121], choosingan x (mod n) at random is the same as choosing for each i a number xi (mod pαii )at random, where x ≡ xi (mod pαii ). The multiplicative group (mod pα) for any oddprime power pα is cyclic [Knuth 1981], so for the odd prime power pαii , the probabilityis at most 1/2 of choosing an xi having a particular power of 2 as the largest divisor ofits order ri. Thus each of these powers of 2 has at most a 50% probability of agreeingwith the previous ones, so all k of them agree with probability at most 1/2k−1. Thereis thus at least a 1− 1/2k−1 probability that the x we choose is good. This argumentshows the scheme will work as long as n is odd and not a prime power; finding a factorof even numbers and of prime powers can be done efficiently with classical methods.

We now describe the algorithm for finding the order of x (mod n) on a quantumcomputer. This algorithm will use two quantum registers which hold integers repre-sented in binary. There will also be some amount of workspace. This workspace getsreset to 0 after each subroutine of our algorithm, so we will not include it when wewrite down the state of our machine.

Given x and n, to find the order of x, i.e., the least r such that xr ≡ 1 (mod n),we do the following. First, we find q, the power of 2 with n2 ≤ q < 2n2. We will notinclude n, x, or q when we write down the state of our machine, because we neverchange these values. In a quantum gate array we need not even keep these values inmemory, as they can be built into the structure of the gate array.

Next, we put the first register in the uniform superposition of states representingnumbers a (mod q). This leaves our machine in state

1

q1/2

q−1∑a=0

|a〉|0〉.(5.1)

This step is relatively easy, since all it entails is putting each bit in the first registerinto the superposition 1√

2(|0〉+ |1〉).

Next, we compute xa (mod n) in the second register as described in section 3.

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 16: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1499

Since we keep a in the first register this can be done reversibly. This leaves ourmachine in the state

1

q1/2

q−1∑a=0

|a〉|xa (mod n)〉.(5.2)

We then perform our Fourier transform Aq on the first register, as describedin section 4, mapping |a〉 to

1

q1/2

q−1∑c=0

exp(2πiac/q)|c〉.(5.3)

That is, we apply the unitary matrix with the (a, c) entry equal to 1q1/2

exp(2πiac/q).

This leaves our machine in state

1

q

q−1∑a=0

q−1∑c=0

exp(2πiac/q)|c〉|xa (mod n)〉.(5.4)

Finally, we observe the machine. It would be sufficient to observe solely the valueof |c〉 in the first register, but for clarity we will assume that we observe both |c〉 and|xa (mod n)〉. We now compute the probability that our machine ends in a particularstate |c, xk (mod n)〉, where we may assume 0 ≤ k < r. Summing over all possibleways to reach the state |c, xk (mod n)〉, we find that this probability is∣∣∣∣∣1q ∑

a: xa≡xk

exp(2πiac/q)

∣∣∣∣∣2

,(5.5)

where the sum is over all a, 0 ≤ a < q, such that xa ≡ xk (mod n). Because the orderof x is r, this sum is over all a satisfying a ≡ k (mod r). Writing a = br + k, we findthat the above probability is∣∣∣∣∣∣1q

b(q−k−1)/rc∑b=0

exp(2πi(br + k)c/q)

∣∣∣∣∣∣2

.(5.6)

We can ignore the term of exp(2πikc/q), as it can be factored out of the sum and hasmagnitude 1. We can also replace rc with {rc}q, where {rc}q is the residue which iscongruent to rc (mod q) and is in the range −q/2 < {rc}q ≤ q/2. This leaves us withthe expression ∣∣∣∣∣∣1q

b(q−k−1)/rc∑b=0

exp(2πib{rc}q/q)∣∣∣∣∣∣2

.(5.7)

We will now show that if {rc}q is small enough, all the amplitudes in this sum willbe in nearly the same direction, i.e., have close to the same phase, and thus make thesum large. Turning the sum into an integral, we obtain

(5.8)

1

q

∫ b(q−k−1)/rc

0

exp(2πib{rc}q/q)db + O

(b(q − k − 1)/rcq

(exp(2πi{rc}q/q)− 1)

).

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 17: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1500 PETER W. SHOR

If |{rc}q| ≤ r/2, the error term in the above expression is easily seen to be boundedby O(1/q). We now show that if |{rc}q| ≤ r/2, the above integral is large, so theprobability of obtaining a state |c, xk (mod n)〉 is large. Note that this conditiondepends only on c and is independent of k. Substituting u = rb/q in the aboveintegral, we get

1

r

∫ b(q−k−1)/rcr/q

0

exp

(2πi

{rc}qr

u

)du.(5.9)

Since k < r, approximating the upper limit of integration by 1 results in only a O(1/q)error in the above expression. If we do this, we obtain the integral

1

r

∫ 1

0

exp

(2πi

{rc}qr

u

)du.(5.10)

Letting {rc}q/r vary between − 12 and 1

2 , the absolute magnitude of the integral (5.10)is easily seen to be minimized when {rc}q/r = ± 1

2 , in which case the absolute valueof expression (5.10) is 2/(πr). The square of this quantity is a lower bound on theprobability that we see any particular state |c, xk (mod n)〉 with {rc}q ≤ r/2; thisprobability is thus asymptotically bounded below by 4/(π2r2), and so is at least 1/3r2

for sufficiently large n.The probability of seeing a given state |c, xk (mod n)〉 will thus be at least 1/3r2

if

−r2

≤ {rc}q ≤ r

2,(5.11)

i.e., if there is a d such that

−r2

≤ rc− dq ≤ r

2.(5.12)

Dividing by rq and rearranging the terms give∣∣∣∣ cq − d

r

∣∣∣∣ ≤ 1

2q.(5.13)

We know c and q. Because q > n2, there is at most one fraction d/r with r < n thatsatisfies the above inequality. Thus, we can obtain the fraction d/r in lowest termsby rounding c/q to the nearest fraction having a denominator smaller than n. Thisfraction can be found in polynomial time by using a continued fraction expansion ofc/q, which finds all the best approximations of c/q by fractions [Knuth 1981; Hardyand Wright 1979, Chapter X].

The exact probabilities as given by (5.7) for an example case with r = 10 andq = 256 are plotted in Fig. 5.1. For example, the value r = 10 could occur whenfactoring 33 if x were chosen to be 5. Here q is taken smaller than 332 so as tomake the values of c in the plot distinguishable; this does not change the functionalstructure of P(c). Note that with high probability the observed value of c is near anintegral multiple of q/r = 256/10.

If we have the fraction d/r in lowest terms, and if d happens to be relatively primeto r, this will give us r. We will now count the number of states |c, xk (mod n)〉 whichenable us to compute r in this way. There are φ(r) possible values of d relativelyprime to r, where φ is Euler’s totient function [Knuth 1981; Hardy and Wright 1979,

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 18: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1501

0.00

0.02

0.04

0.06

0.08

0.10

0.12

0 32 64 96 128 160 192 224 256

P

c

Fig. 5.1. The probability P of observing values of c between 0 and 255, given q = 256 and r = 10.

section 5.5]. Each of these fractions d/r is close to one fraction c/q with |c/q−d/r| ≤1/2q. There are also r possible values for xk, since r is the order of x. Thus, there arerφ(r) states |c, xk (mod n)〉 which would enable us to obtain r. Since each of thesestates occurs with probability at least 1/3r2, we obtain r with probability at leastφ(r)/3r. Using the theorem that φ(r)/r > δ/ log log r for some constant δ [Hardy andWright 1979, Theorem 328], this shows that we find r at least a δ/ log log r fractionof the time, so by repeating this experiment only O(log log r) times, we are assuredof a high probability of success.

In practice, assuming that quantum computation is more expensive than classicalcomputation, it would be worthwhile to alter the above algorithm so as to performless quantum computation and more postprocessing. First, if the observed state is|c〉, it would be wise to also try numbers close to c such as c± 1, c± 2, . . . , since thesealso have a reasonable chance of being close to a fraction qd/r. Second, if c/q ≈ d/rwhere d and r have a common factor, this factor is likely to be small. Thus, if theobserved value of c/q is rounded off to d′/r′ in lowest terms, for a candidate r oneshould consider not only r′ but also its small multiples 2r′, 3r′, . . . , to see if theseare the actual order of x. Although the first technique will only reduce the expectednumber of trials required to find r by a constant factor, the second technique willreduce the expected number of trials for the hardest n from O(log log n) to O(1) ifthe first (logn)1+ε multiples of r′ are considered [Odylzko 1995]. A third technique,if two candidates for r—say, r1 and r2—have been found, is to test the least commonmultiple of r1 and r2 as a candidate r. This third technique is also able to reducethe expected number of trials to a constant [Knill 1995] and will work in some caseswhere the first two techniques fail.

Note that in this algorithm for determining the order of an element, we did not usemany of the properties of multiplication (mod n). In fact, if we have a permutationf mapping the set {0, 1, 2, . . . , n − 1} into itself such that its kth iterate, f (k)(a), iscomputable in time polynomial in logn and log k, the same algorithm will be able tofind the order of an element a under f , i.e., the minimum r such that f (r)(a) = a.

6. Discrete logarithms. For every prime p, the multiplicative group (mod p)is cyclic, that is, there are generators g such that 1, g, g2, . . . , gp−2 comprise all thenonzero residues (mod p) [Hardy and Wright 1979, Theorem 111; Knuth 1981]. Sup-pose that we are given a prime p and such a generator g. The discrete logarithm of

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 19: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1502 PETER W. SHOR

a number x with respect to p and g is the integer r with 0 ≤ r < p − 1 such thatgr ≡ x (mod p). The fastest algorithm known for finding discrete logarithms moduloarbitrary primes p is Gordon’s [1993] adaptation of the number field sieve, which runsin time exp(O(log p)1/3(log log p)2/3). We show how to find discrete logarithms ona quantum computer using two modular exponentiations and two quantum Fouriertransforms.

This algorithm will use three quantum registers. We first find q a power of 2 suchthat q is close to p, i.e., with p < q < 2p. Next, we put the first two registers inour quantum computer in the uniform superposition of all |a〉 and |b〉 (mod p − 1).One way to do this in quantum polynomial time is to put the register in a uniformsuperposition of all the numbers from 0 to q − 1, use quantum computation to testwhether the number is less than p, and restart the algorithm if the results of this testare unfavorable. We next compute gax−b (mod p) in the third register. This leavesour machine in the state

1

p− 1

p−2∑a=0

p−2∑b=0

|a, b, gax−b (mod p)〉.(6.1)

As before, we use the Fourier transform Aq to send |a〉 → |c〉 and |b〉 → |d〉 withprobability amplitude 1

q exp(2πi(ac + bd)/q). That is, we take the state |a, b〉 to thestate

1

q

q−1∑c=0

q−1∑d=0

exp

(2πi

q(ac + bd)

)|c, d〉.(6.2)

This leaves our quantum computer in the state

1

(p− 1)q

p−2∑a,b=0

q−1∑c,d=0

exp

(2πi

q(ac + bd)

)|c, d, gax−b (mod p)〉.(6.3)

Finally, we observe the state of the quantum computer.The probability of observing a state |c, d, y〉 with y ≡ gk (mod p) is∣∣∣∣∣∣∣

1

(p− 1)q

∑a,b

a−rb≡k

exp

(2πi

q(ac + bd)

) ∣∣∣∣∣∣∣2

,(6.4)

where the sum is over all (a, b) such that a− rb ≡ k (mod p− 1). Note that we nowhave two moduli to deal with, p− 1 and q. While this makes keeping track of thingsmore confusing, it does not pose serious problems. We now use the relation

a = br + k − (p− 1)

⌊br + k

p− 1

⌋(6.5)

and substitute (6.5) in the expression (6.4) to obtain the amplitude on|c, d, gk (mod p)〉, which is

1

(p− 1)q

p−2∑b=0

exp

(2πi

q

(brc + kc + bd− c(p− 1)

⌊br + k

p− 1

⌋)).(6.6)

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 20: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1503

The square of the absolute value of this amplitude is the probability of observingthe state |c, d, gk (mod p)〉. We will now analyze the expression (6.6). First, a factorof exp(2πikc/q) can be taken out of all the terms and ignored, because it does notchange the probability. Next, we split the exponent into two parts and factor out bto obtain

1

(p− 1)q

p−2∑b=0

exp

(2πi

qbT

)exp

(2πi

qV

),(6.7)

where

T = rc + d− r

p− 1{c(p− 1)}q,(6.8)

and

V =

(br

p− 1−⌊br + k

p− 1

⌋){c(p− 1)}q.(6.9)

Here by {z}q we mean the residue of z (mod q) with −q/2 < {z}q ≤ q/2, as in (5.7).We now classify possible outputs (observed states) of the quantum computer as

“good” or “bad.” We will show that if we get enough good outputs, then we willlikely be able to deduce r, and that furthermore, the chance of getting a good outputis constant. The idea is that if∣∣{T}q∣∣ = ∣∣rc + d− r

p− 1{c(p− 1)}q − jq

∣∣ ≤ 1

2,(6.10)

where j is the closest integer to T/q, then as b varies between 0 and p− 2, the phaseof the first exponential term in (6.7) only varies over at most half of the unit circle.Further, if

|{c(p− 1)}q| ≤ q/12,(6.11)

then |V | is always at most q/12, so the phase of the second exponential term in(6.7) never is farther than exp(πi/6) from 1. If conditions (6.10) and (6.11) bothhold, we will say that an output is good. We will show that if both conditions hold,then the contribution to the probability from the corresponding term is significant.Furthermore, both conditions will hold with constant probability, and a reasonablesample of c’s for which condition (6.10) holds will allow us to deduce r.

We now give a lower bound on the probability of each good output, i.e., an outputthat satisfies conditions (6.10) and (6.11). We know that as b ranges from 0 to p− 2,the phase of exp(2πibT/q) ranges from 0 to 2πiW where

W =p− 2

q

(rc + d− r

p− 1{c(p− 1)}q − jq

).(6.12)

It follows from (6.10) that |W | ≤ (p − 2)/(2q) ≤ 1/2. Thus, the component of theamplitude of the first exponential in the summand of (6.7) in the direction

exp (πiW )(6.13)

is at least cos(2π(W/2−Wb/(p− 2))), where 2π(W/2−Wb/(p− 2)) is between −π2

and π2 . By condition (6.11), the phase can vary by at most πi/6 due to the second

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 21: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1504 PETER W. SHOR

exponential exp(2πiV/q). Applying this variation in the manner that minimizes thecomponent in the direction (6.13), we get that the component in this direction is atleast

cos(2π |W/2−Wb/(p− 2)|+ π

6

).(6.14)

Thus we get that the absolute value of the amplitude (6.7) is at least

1

(p− 1)q

p−2∑b=0

cos(2π |W/2−Wb/(p− 2)|+ π

6

).(6.15)

Replacing this sum with an integral, we get that the absolute value of this amplitudeis at least

2

q

∫ 1/2

0

cos(π

6+ 2π|W |u

)du + O

(W

pq

).(6.16)

From condition (6.10), |W | ≤ 12 , so the error term is O( 1

pq ). As W varies between

− 12 and 1

2 , the integral (6.16) is minimized when |W | = 12 . Thus, the probability of

arriving at a state |c, d, y〉 that satisfies both conditions (6.10) and (6.11) is at least(1

q

2

π

∫ 2π/3

π/6

cosu du

)2

,(6.17)

or at least .054/q2 > 1/(20q2).We will now count the number of pairs (c, d) satisfying conditions (6.10)

and (6.11). The number of pairs (c, d) such that (6.10) holds is exactly the num-ber of possible c’s, since for every c there is exactly one d such that (6.10) holds.Unless gcd(p−1, q) is large, the number of c’s for which (6.11) holds is approximatelyq/6, and even if it is large, this number is at least q/12. Thus, there are at leastq/12 pairs (c, d) satisfying both conditions. Multiplying by p − 1, which is the num-ber of possible y’s, gives approximately pq/12 good states |c, d, y〉. Combining thiscalculation with the lower bound 1/(20q2) on the probability of observing each goodstate gives us that the probability of observing some good state is at least p/(240q),or at least 1/480 (since q < 2p). Note that each good c has a probability of at least(p−1)/(20q2) ≥ 1/(40q) of being observed, since there p−1 values of y and one valueof d with which c can make a good state |c, d, y〉.

We now want to recover r from a pair c, d such that

− 1

2q≤ d

q+ r

(c(p− 1)− {c(p− 1)}q

(p− 1)q

)≤ 1

2q(mod 1),(6.18)

where this equation was obtained from condition (6.10) by dividing by q. The firstthing to notice is that the multiplier on r is a fraction with denominator p−1, since qevenly divides c(p− 1)−{c(p− 1)}q. Thus, we need only round d/q off to the nearestmultiple of 1/(p− 1) and divide (mod p− 1) by the integer

c′ =c(p− 1)− {c(p− 1)}q

q(6.19)

to find a candidate r. To show that the quantum calculation need only be repeateda polynomial number of times to find the correct r requires only a few more details.

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 22: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1505

The problem is that we cannot divide by a number c′ which is not relatively prime top− 1.

For the discrete log algorithm, we do not know that all possible values of c′ aregenerated with reasonable likelihood; we only know this about one-twelfth of them.This additional difficulty makes the next step harder than the corresponding step inthe algorithm for factoring. If we knew the remainder of r modulo all prime powersdividing p−1, we could use the Chinese remainder theorem to recover r in polynomialtime. We will only be able to prove that we can find this remainder for primes largerthan 18, but with a little extra work we will still be able to recover r.

Recall that each good (c, d) pair is generated with probability at least 1/(20q2),and that at least one-twelfth of the possible c’s are in a good (c, d) pair. From (6.19),it follows that these c’s are mapped from c/q to c′/(p− 1) by rounding to the nearestintegral multiple of 1/(p− 1). Further, the good c’s are exactly those in which c/q isclose to c′/(p− 1). Thus, each good c corresponds with exactly one c′. We would liketo show that for any prime power pαii dividing p− 1, a random good c′ is unlikely tocontain pi. If we are willing to accept a large constant for our algorithm, we can justignore the prime powers under 18; if we know r modulo all prime powers over 18, wecan try all possible residues for primes under 18 with only a (large) constant factorincrease in running time. Because at least one-twelfth of the c’s were in a good (c, d)pair, at least one-twelfth of the c′’s are good. Thus, for a prime power pαii , a randomgood c′ is divisible by pαii with probability at most 12/pαii . If we have t good c′’s, theprobability of having a prime power over 18 that divides all of them is therefore atmost ∑

18<pαii

∣∣(p−1)

(12

pαii

)t

,(6.20)

where a|b means that a evenly divides b, so the sum is over all prime powers greaterthan 18 that divide p− 1. This sum (over all integers > 18) converges for t = 2, andgoes down by at least a factor of 2/3 for each further increase of t by 1; thus for someconstant t it is less than 1/2.

Recall that each good c′ is obtained with probability at least 1/(40q) from anyexperiment. Since there are q/12 good c′’s, after 480t experiments, we are likely toobtain a sample of t good c′’s chosen equally likely from all good c′’s. Thus, we willbe able to find a set of c′’s such that all prime powers pαii > 20 dividing p − 1 arerelatively prime to at least one of these c′’s. To obtain a polynomial time algorithm,all one need do is try all possible sets of c′’s of size t; in practice, one would use analgorithm to find sets of c′’s with large common factors. This set gives the residueof r for all primes larger than 18. For each prime pi less than 18, we have at most18 possibilities for the residue modulo pαii , where αi is the exponent on prime pi inthe prime factorization of p− 1. We can thus try all possibilities for residues modulopowers of primes less than 18: for each possibility we can calculate the corresponding rusing the Chinese remainder theorem and then check to see whether it is the desireddiscrete logarithm.

If one were to actually program this algorithm there are many ways in which theefficiency could be increased over the efficiency shown in this paper. For example,the estimate for the number of good c′’s is likely too low, especially since weakerconditions than (6.10) and (6.11) should suffice. This means that the number oftimes the experiment need be run could be reduced. It also seems improbable thatthe distribution of bad values of c′ would have any relationship to primes under 18;

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 23: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1506 PETER W. SHOR

if this is true, we need not treat small prime powers separately.This algorithm does not use very many properties of Zp, so we can use the same

algorithm to find discrete logarithms over other fields such as Zpα , as long as thefield has a cyclic multiplicative group. All we need is that we know the order of thegenerator, and that we can multiply and take inverses of elements in polynomial time.The order of the generator could in fact be computed using the quantum order-findingalgorithm given in section 5 of this paper. Boneh and Lipton [1995] have generalizedthe algorithm so as to be able to find discrete logarithms when the group is abelianbut not cyclic.

7. Comments and open problems. It is currently believed that the most diffi-cult aspect of building an actual quantum computer will be dealing with the problemsof imprecision and decoherence. It was shown by Bernstein and Vazirani [1993] thatthe quantum gates need only have precision O(1/t) in order to have a reasonableprobability of completing t steps of quantum computation; that is, there is a c suchthat if the amplitudes in the unitary matrices representing the quantum gates are allperturbed by at most c/t, the quantum computer will still have a reasonable chanceof producing the desired output. Similarly, the decoherence needs to be only poly-nomially small in t in order to have a reasonable probability of completing t steps ofcomputation successfully. This holds not only for the simple model of decoherencewhere each bit has a fixed probability of decohering at each time step, but also for morecomplicated models of decoherence which are derived from fundamental quantum me-chanical considerations [Unruh 1995; Palma, Suominen, and Ekert 1996; Chuang etal. 1995]. However, building quantum computers with high enough precision and lowenough decoherence to accurately perform long computations may present formidabledifficulties to experimental physicists. In classical computers, error probabilities canbe reduced not only through hardware but also through software, by the use of redun-dancy and error-correcting codes. The most obvious method of using redundancy inquantum computers is ruled out by the theorem that quantum bits cannot be cloned[Peres 1993, section 9-4], but this argument does not rule out more complicated waysof reducing inaccuracy or decoherence using software. In fact, some progress in thedirection of reducing inaccuracy [Berthiaume, Deutsch, and Jozsa 1994] and decoher-ence [Shor 1995] has already been made. The result of Bennett et al. [1996] thatquantum bits can be faithfully transmitted over a noisy quantum channel gives fur-ther hope that quantum computations can similarly be faithfully carried out usingnoisy quantum bits and noisy quantum gates.

Discrete logarithms and factoring are not in themselves widely useful problems.They have become useful only because they have been found to be crucial for public-key cryptography, and this application is in turn possible only because they have beenpresumed to be difficult. This is also true of the generalizations of Boneh and Lipton[1995] of these algorithms. If the only uses of quantum computation remain discretelogarithms and factoring, it will likely become a special-purpose technique whose onlyraison d’etre is to thwart public key cryptosystems. However, there may be otherhard problems which could be solved asymptotically faster with quantum computers.In particular, of interesting problems not known to be NP-complete, the problemof finding a short vector in a lattice [Adleman 1994, Adleman and McCurley 1994]seems as if it might potentially be amenable to solution by a quantum computer.

In the history of computer science, however, most important problems have turnedout to be either polynomial-time or NP-complete. Thus quantum computers will likelynot become widely useful unless they can solve NP-complete problems. Solving NP-

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 24: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1507

complete problems efficiently is a Holy Grail of theoretical computer science whichvery few people expect to be possible on a classical computer. Finding polynomial-time algorithms for solving these problems on a quantum computer would be a mo-mentous discovery. There are some weak indications that quantum computers are notpowerful enough to solve NP-complete problems [Bennett et al. 1997], but I do notbelieve that this potentiality should be ruled out as yet.

Acknowledgments. I would like to thank Jeff Lagarias for finding and fixinga critical error in the first version of the discrete log algorithm. I would also liketo thank him, David Applegate, Charlie Bennett, Gilles Brassard, Andrew Odlyzko,Dan Simon, Bob Solovay, Umesh Vazirani, and correspondents too numerous to list,for productive discussions, for corrections to and improvements of early drafts of thispaper, and for pointers to the literature.

REFERENCES

L. M. Adleman (1994), Algorithmic number theory—The complexity contribution, in Proc. 35thAnnual Symposium on Foundations of Computer Science, IEEE Computer Society Press, LosAlamitos, CA, pp. 88–113.

L. M. Adleman and K. S. McCurley (1994), Open problems in number-theoretic complexity II, inAlgorithmic Number Theory, Proc. 1994 Algorithmic Number Theory Symposium, Ithaca, NY,Lecture Notes in Computer Science 877, L. M. Adleman and M.-D. Huang, eds., Springer-Verlag,Berlin, pp. 291–322.

A. Barenco, C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor, T. Sleator,

J. A. Smolin, and H. Weinfurter (1995a), Elementary gates for quantum computation, Phys.Rev. A, 52, pp. 3457–3467.

A. Barenco, D. Deutsch, A. Ekert, and R. Jozsa (1995b), Conditional quantum dynamics andlogic gates, Phys. Rev. Lett., 74, pp. 4083–4086.

D. Beckman, A. N. Chari, S. Devabhaktuni, and J. Preskill (1996), Efficient networks forquantum factoring, Phys. Rev. A, 54, pp. 1034–1063.

P. Benioff (1980), The computer as a physical system: A microscopic quantum mechanical Hamil-tonian model of computers as represented by Turing machines, J. Statist. Phys., 22, pp. 563–591.

P. Benioff (1982a), Quantum mechanical Hamiltonian models of Turing machines, J. Statist. Phys.,29, pp. 515–546.

P. Benioff (1982b), Quantum mechanical Hamiltonian models of Turing machines that dissipateno energy, Phys. Rev. Lett., 48, pp. 1581–1585.

C. H. Bennett (1973), Logical reversibility of computation, IBM J. Res. Develop., 17, pp. 525–532.C. H. Bennett (1989), Time/space trade-offs for reversible computation, SIAM J. Comput., 18,

pp. 766–776.C. H. Bennett, E. Bernstein, G. Brassard, and U. Vazirani (1997), Strengths and weaknesses

of quantum computing, SIAM J. Comput., 26, pp. 1510–1523.C. H. Bennett, G. Brassard, S. Popescu, B. Schumacher, J. A. Smolin, and W. K. Wooters

(1996), Purification of noisy entanglement and faithful teleportation via noisy channels, Phys.Rev. Lett., 76, pp. 722–725.

E. Bernstein and U. Vazirani (1993), Quantum complexity theory, in Proc. 25th Annual ACMSymposium on Theory of Computing, Association for Computing Machinery, New York, pp. 11–20; SIAM J. Comput., 26 (1997), pp. 1411–1473.

A. Berthiaume and G. Brassard (1992), The quantum challenge to structural complexity theory,in Proc. 7th Annual Structure in Complexity Theory Conference, IEEE Computer Society Press,Los Alamitos, CA, pp. 132–137.

A. Berthiaume and G. Brassard (1994), Oracle quantum computing, J. Modern Opt., 41, pp.2521–2535.

A. Berthiaume, D. Deutsch, and R. Jozsa (1994), The stabilisation of quantum computations,in Proc. Workshop on Physics of Computation: PhysComp ’94, IEEE Computer Society Press,Los Alamitos, CA, pp. 60–62.

M. Biafore (1994), Can quantum computers have simple Hamiltonians, in Proc. Workshop onPhysics of Computation: PhysComp ’94, IEEE Computer Society Press, Los Alamitos, CA,pp. 63–68.

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 25: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

1508 PETER W. SHOR

D. Boneh and R. J. Lipton (1995), Quantum cryptanalysis of hidden linear functions, Advancesin Cryptology—CRYPTO ’95, Proc. 15th Annual International Cryptology Conference, SantaBarbara, CA, D. Coppersmith, ed. Springer-Verlag, Berlin, pp. 424–437.

J. F. Canny and J. Reif (1987), New lower bound techniques for robot motion planning problems,in Proc. 28th Annual Symposium on Foundations of Computer Science, IEEE Computer SocietyPress, Los Alamitos, CA, pp. 49–60.

J. Choi, J. Sellen, and C.-K. Yap (1995), Precision-sensitive Euclidean shortest path in 3-space,in Proc. 11th Annual Symposium on Computational Geometry, Association for Computing Ma-chinery, New York, pp. 350–359.

I. L. Chuang, R. Laflamme, P. W. Shor, and W. H. Zurek (1995), Quantum computers, factoringand decoherence, Science, 270, pp. 1633–1635.

I. L. Chuang and Y. Yamamoto (1995), A simple quantum computer, Phys. Rev. A, 52, pp. 3489–3496.

A. Church (1936), An unsolvable problem of elementary number theory, Amer. J. Math., 58, pp. 345–363.

J. I. Cirac and P. Zoller (1995), Quantum computations with cold trapped ions, Phys. Rev. Lett.,74, pp. 4091–4094.

R. Cleve (1994), A note on computing Fourier transforms by quantum programs, preprint.D. Coppersmith (1994), An Approximate Fourier Transform Useful in Quantum Factoring, IBM

Research Report RC 19642.D. Deutsch (1985), Quantum theory, the Church–Turing principle and the universal quantum com-

puter, Proc. Roy. Soc. London Ser. A, 400, pp. 96–117.D. Deutsch (1989), Quantum computational networks, Proc. Roy. Soc. London Ser. A, 425, pp. 73–

90.D. Deutsch, A. Barenco, and A. Ekert (1995), Universality of quantum computation, Proc. Roy.

Soc. London Ser. A, 449, pp. 669-677.D. Deutsch and R. Jozsa (1992), Rapid solution of problems by quantum computation, Proc. Roy.

Soc. London Ser. A, 439, pp. 553–558.D. P. DiVincenzo (1995), Two-bit gates are universal for quantum computation, Phys. Rev. A, 51,

pp. 1015–1022.A. Ekert and R. Jozsa (1996), Shor’s quantum algorithm for factorising numbers, Rev. Mod.

Phys., 68, pp. 733–753.R. Feynman (1982), Simulating physics with computers, Internat. J. Theoret. Phys., 21, pp. 467–488.R. Feynman (1986), Quantum mechanical computers, Found. Phys., 16, pp. 507–531; originally

published in Optics News (February 1985), pp. 11–20.E. Fredkin and T. Toffoli (1982), Conservative logic, Internat. J. Theoret. Phys., 21, pp. 219–253.D. M. Gordon (1993), Discrete logarithms in GF(p) using the number field sieve, SIAM J. Discrete

Math., 6, pp. 124–139.R. B. Griffiths and C.-S. Niu (1996), Semiclassical Fourier tranform for quantum computation,

Phys. Rev. Lett., 76, pp. 3228–3231.G. H. Hardy and E. M. Wright (1979), An Introduction to the Theory of Numbers, 5th ed., Oxford

University Press, New York.J. Hartmanis and J. Simon (1974), On the power of multiplication in random access machines, in

Proc. 15th Annual Symposium on Switching and Automata Theory, IEEE Computer Society,Long Beach, CA, pp. 13–23.

A. Karatsuba and Yu. Ofman (1962), Multiplication of multidigit numbers on automata, Dokl.Akad. Nauk SSSR, 145, pp. 293–294 (in Russian); Sov. Phys. Dokl., 7 (1963), pp. 595–596(English translation).

E. Knill (1995), personal communication.D. E. Knuth (1981), The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 2nd

ed., Addison-Wesley, Reading, MA.R. Landauer (1995), Is quantum mechanics useful? Philos. Trans. Roy. Soc. London Ser. A, 353,

pp. 367-376.R. Landauer (1997), Is quantum mechanically coherent computation useful?, in Proc. Drexel-4

Symposium on Quantum Nonintegrability—Quantum Classical Correspondence, D. H. Feng andB-L. Hu, eds., International Press, Cambridge, MA, to appear.

Y. Lecerf (1963), Machines de Turing reversibles. Recursive insolubilite en n ∈ N de l’equationu = θnu, ou θ est un isomorphisme de codes, C. R. Acad. Francaise Sci., 257, pp. 2597–2600.

A. K. Lenstra and H. W. Lenstra, Jr., eds. (1993), The Development of the Number Field Sieve,Lecture Notes in Mathematics 1554, Springer-Verlag, Berlin.

A. K. Lenstra, H. W. Lenstra, Jr., M. S. Manasse, and J. M. Pollard (1990), The number fieldsieve, in Proc. 22nd Annual ACM Symposium on Theory of Computing, Association for Com-puting Machinery, New York, pp. 564–572; expanded version appears in Lenstra and Lenstra,

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php

Page 26: POLYNOMIAL-TIME ALGORITHMS FOR PRIME FACTORIZATION

PRIME FACTORIZATION ON A QUANTUM COMPUTER 1509

Jr. [1993], pp. 11–42.R. Y. Levine and A. T. Sherman (1990), A note on Bennett’s time-space tradeoff for reversible

computation, SIAM J. Comput., 19, pp. 673–677.S. Lloyd (1993), A potentially realizable quantum computer, Science, 261, pp. 1569–1571.S. Lloyd (1994), Envisioning a quantum supercomputer, Science, 263, p. 695.S. Lloyd (1995), Almost any quantum logic gate is universal, Phys. Rev. Lett., 75, pp. 346–349.N. Margolus (1986), Quantum computation, Ann. New York Acad. Sci., 480, pp. 487–497.N. Margolus (1990), Parallel quantum computation, in Complexity, Entropy and the Physics of

Information, Santa Fe Institute Studies in the Sciences of Complexity, Vol. VIII, W. H. Zurek,ed., Addison-Wesley, Reading, MA, pp. 273–287.

G. L. Miller (1976), Riemann’s hypothesis and tests for primality, J. Comput. System Sci., 13,pp. 300–317.

A. M. Odlyzko (1995), personal communication.G. M. Palma, K.-A. Suominen, and A. K. Ekert (1996), Quantum computers and dissipation,

Proc. Roy. Soc. London Ser. A, 452, pp. 567–584.A. Peres (1993), Quantum Theory: Concepts and Methods, Kluwer Academic Publishers, Dordrecht,

The Netherlands.C. Pomerance (1987), Fast, rigorous factorization and discrete logarithm algorithms, in Discrete

Algorithms and Complexity, Proc. Japan–US Joint Seminar, 1986, Kyoto, D. S. Johnson, T.Nishizeki, A. Nozaki, and H. S. Wilf, eds., Academic Press, New York, pp. 119–143.

E. Post (1936), Finite combinatory processes. Formulation I, J. Symbolic Logic, 1, pp. 103–105.R. L. Rivest, A. Shamir, and L. Adleman (1978), A method of obtaining digital signatures and

public-key cryptosystems, Comm. Assoc. Comput. Mach., 21, pp. 120–126.L. A. Rubel (1989), Digital simulation of analog computation and Church’s thesis, J. Symbolic

Logic, 54, pp. 1011–1017.A. Schonhage (1982), Asymptotically fast algorithms for the numerical multiplication and division

of polynomials with complex coefficients, in Computer Algebra EUROCAM ’82, Lecture Notesin Computer Science 144, J. Calmet, ed., Springer-Verlag, Berlin, pp. 3–15.

A. Schonhage, A. F. W. Grotefeld, and E. Vetter (1994), Fast Algorithms: A Multitape TuringMachine Implementation, B. I. Wissenschaftsverlag, Mannheim, Germany.

A. Schonhage and V. Strassen (1971), Schnelle Multiplikation grosser Zahlen, Computing, 7,pp. 281–292.

P. W. Shor (1994), Algorithms for quantum computation: Discrete logarithms and factoring, inProc. 35th Annual Symposium on Foundations of Computer Science, IEEE Computer SocietyPress, Los Alamitos, CA, pp. 124–134.

P. W. Shor (1995), Scheme for reducing decoherence in quantum computer memory, Phys. Rev. A,52, pp. 2493–2496.

D. Simon (1994), On the power of quantum computation, in Proc. 35th Annual Symposium onFoundations of Computer Science, IEEE Computer Society Press, Los Alamitos, CA, pp. 116–123; SIAM J. Comput., 26 (1997), pp. 1474–1483.

T. Sleator and H. Weinfurter (1995), Realizable universal quantum logic gates, Phys. Rev. Lett.,74, pp. 4087–4090.

R. Solovay (1995), personal communication.K. Steiglitz (1988), Two non-standard paradigms for computation: Analog machines and cellular

automata, in Performance Limits in Communication Theory and Practice, Proc. NATO Ad-vanced Study Institute, Il Ciocco, Castelvecchio Pascoli, Tuscany, Italy, 1986, J. K. Skwirzynski,ed., Kluwer Academic Publishers, Dordrecht, The Netherlands, pp. 173–192.

W. G. Teich, K. Obermayer, and G. Mahler (1988), Structural basis of multistationary quantumsystems II: Effective few-particle dynamics, Phys. Rev. B, 37, pp. 8111–8121.

T. Toffoli (1980), Reversible computing, in Automata, Languages and Programming, 7th Collo-quium, Lecture Notes in Computer Science 84, J. W. de Bakker and J. van Leeuwen, eds.,Springer-Verlag, Berlin, pp. 632–644.

A. M. Turing (1936), On computable numbers, with an application to the Entscheidungsproblem,in Proc. London Math. Soc. (2), 42, pp. 230–265; corrections in Proc. London Math. Soc. (2),43 (1937), pp. 544–546.

W. G. Unruh (1995), Maintaining coherence in quantum computers, Phys. Rev. A, 51, pp. 992–997.P. van Emde Boas (1990), Machine models and simulations, in Handbook of Theoretical Computer

Science, Vol. A, J. van Leeuwen, ed., Elsevier, Amsterdam, pp. 1–66.A. Vergis, K. Steiglitz, and B. Dickinson (1986), The complexity of analog computation, Math.

Comput. Simulation, 28, pp. 91–113.A. Yao (1993), Quantum circuit complexity, in Proc. 34th Annual Symposium on Foundations of

Computer Science, IEEE Computer Society Press, Los Alamitos, CA, pp. 352–361.

Dow

nloa

ded

10/1

8/16

to 1

24.1

6.14

8.10

. Red

istr

ibut

ion

subj

ect t

o SI

AM

lice

nse

or c

opyr

ight

; see

http

://w

ww

.sia

m.o

rg/jo

urna

ls/o

jsa.

php