
Job Shop Scheduling Solver based on Quantum Annealing

Davide Venturelli¹,², Dominic J.J. Marchand³, Galo Rojo³
¹Quantum Artificial Intelligence Laboratory (QuAIL), NASA Ames
²U.S.R.A. Research Institute for Advanced Computer Science (RIACS)
³1QB Information Technologies (1QBit)

Quantum annealing is emerging as a promising near-term quantum computing approach to solving combinatorial optimization problems. A solver for the job-shop scheduling problem that makes use of a quantum annealer is presented in detail. Inspired by methods used for constraint satisfaction problem (CSP) formulations, we first define the makespan-minimization problem as a series of decision instances before casting each instance into a time-indexed quadratic unconstrained binary optimization. Several pre-processing and graph-embedding strategies are employed to compile optimally parametrized families of problems for scheduling instances on the D-Wave Systems' Vesuvius quantum annealer (D-Wave Two). Problem simplifications and partitioning algorithms, including variable pruning, are discussed and the results from the processor are compared against classical global-optimum solvers.

I. INTRODUCTION

The commercialization and independent benchmarking [1–4] of quantum annealers based on superconducting qubits has sparked a surge of interest in near-term practical applications of quantum analog computation in the optimization research community. Many of the early proposals for running useful problems arising in space science [5] have been adapted and have seen small-scale testing on the D-Wave Two processor [6]. The best procedure for comparison of quantum analog performance with traditional digital methods is still under debate [3, 7, 8] and remains mostly speculative due to the limited number of qubits on the currently available hardware. While waiting for the technology to scale up to more significant sizes, there is an increasing interest in the identification of small problems which are nevertheless computationally challenging and useful. One approach in this direction has been pursued in [9], and consisted in identifying parametrized ensembles of random instances of operational planning problems of increasing sizes that can be shown to be on the verge of a solvable–unsolvable phase transition. This condition should be sufficient to observe an asymptotic exponential scaling of runtimes, even for instances of relatively small size, potentially testable on current- or next-generation D-Wave hardware. An empirical takeaway from [6] (validated also by experimental results in [10, 11]) was that the established programming and program running techniques for quantum annealers seem to be particularly amenable to scheduling problems, allowing for an efficient mapping and good performance compared to other applied problem classes like automated navigation and Bayesian-network structure learning [12].

Motivated by these first results, and with the intention to challenge current technologies on hard problems of practical value, we herein formulate a quantum annealing version of the job-shop scheduling problem (JSP). The JSP is essentially a general paradigmatic constraint satisfaction problem (CSP) framework for the problem of optimizing the allocation of resources required for the execution of sequences of operations with constraints on location and time. We provide compilation and running strategies for this problem using original and traditional techniques for parametrizing ensembles of instances. Results from the D-Wave Two are compared with classical exact solvers. The JSP has earned a reputation for being especially intractable, a claim supported by the fact that the best general-purpose solvers (CPLEX, Gurobi Optimizer, SCIP) struggle with instances as small as 10 machines and 10 jobs (10 x 10) [13]. Indeed, some known 20 x 15 instances often used for benchmarking still have not been solved to optimality even by the best special-purpose solvers [14], and 20 x 20 instances are typically completely intractable. We note that this early work constitutes a wide-ranging survey of possible techniques and research directions, and we leave a more in-depth exploration of these topics for future work.

    A. Problem definition and conventions

Typically the JSP consists of a set of jobs J = {j1, . . . , jN} that must be scheduled on a set of machines M = {m1, . . . , mM}. Each job consists of a sequence of operations that must be performed in a predefined order

    jn = {On1 → On2 → · · · → OnLn}.

Job jn is assumed to have Ln operations. Each operation Onj has an integer execution time pnj (a value of zero is allowed) and has to be executed by an assigned machine mqnj ∈ M, where qnj is the index of the assigned machine. There can only be one operation running on any given machine at any given point in time, and each operation of a job needs to complete before the following one can start. The usual objective is to schedule all operations in a valid sequence while minimizing the makespan (i.e., the completion time of the last running job), although other objective functions can be used. In what follows, we will denote with T the minimum possible makespan associated with a given JSP instance.
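As a concrete reference for the notation above, the following minimal Python sketch shows one way such an instance could be represented; the `Operation` and `Job` names, and the example durations, are illustrative choices rather than part of the original work, and the bound computed is simply the total work, used later as an upper bound on the timespan.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Operation:
    machine: int   # index of the machine m_{q_nj} that must execute this operation
    duration: int  # integer execution time p_nj (a value of zero is allowed)

# A job is an ordered list of operations; a JSP instance is a list of jobs.
Job = List[Operation]

def trivial_timespan_bound(jobs: List[Job]) -> int:
    """Upper bound on the makespan: the total work, i.e. the sum of all execution times."""
    return sum(op.duration for job in jobs for op in job)

# Example 3 x 3 instance in the spirit of Figure 1-a (durations chosen arbitrarily here).
jobs = [
    [Operation(2, 1), Operation(0, 1), Operation(1, 2)],   # j1
    [Operation(1, 2), Operation(2, 2), Operation(0, 1)],   # j2
    [Operation(0, 2), Operation(2, 2), Operation(1, 1)],   # j3
]
print(trivial_timespan_bound(jobs))  # 14: any schedule fits within T <= 14 time units
```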

arXiv:1506.08479v2 [quant-ph] 17 Oct 2016


As defined above, the JSP variant we consider is denoted JM | pnj ∈ [pmin, . . . , pmax] | Cmax in the well-known α|β|γ notation, where pmin and pmax are the smallest and largest execution times allowed, respectively. In this notation, JM stands for job-shop type on M machines, and Cmax means we are optimizing the makespan.

For notational convenience, we enumerate the operations in lexicographical order in such a way that

j1 = {O1 → · · · → Ok1},
j2 = {Ok1+1 → · · · → Ok2},
. . .
jN = {OkN−1+1 → · · · → OkN}.   (1)

Given the running index over all operations i ∈ {1, . . . , kN}, we let qi be the index of the machine mqi responsible for executing operation Oi. We define Im to be the set of indices of all of the operations that have to be executed on machine mm, i.e., Im = {i : qi = m}. The execution time of operation Oi is now simply denoted pi.

A priori, a job can use the same machine more than once, or use only a fraction of the M available machines. For benchmarking purposes, it is customary to restrict a study to the problems of a specific family. In this work, we define a ratio θ that specifies the fraction of the total number of machines that is used by each job, assuming no repetition when θ ≤ 1. For example, a ratio of 0.5 means that each job uses only 0.5M distinct machines.
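Building on the earlier sketch, the lexicographical enumeration of Eq. (1) and the machine index sets Im can be computed mechanically; the helper names below are again illustrative only.

```python
from collections import defaultdict

def flatten(jobs):
    """Enumerate operations lexicographically (Eq. 1): returns parallel lists q_i and p_i,
    plus the flattened index of the last operation O_{k_n} of each job (0-based)."""
    q, p, k_last = [], [], []
    for job in jobs:
        for op in job:
            q.append(op.machine)
            p.append(op.duration)
        k_last.append(len(q) - 1)
    return q, p, k_last

def machine_index_sets(q):
    """I_m = {i : q_i = m}: the operations that must run on machine m."""
    I = defaultdict(list)
    for i, m in enumerate(q):
        I[m].append(i)
    return dict(I)

q, p, k_last = flatten(jobs)
I = machine_index_sets(q)                               # e.g. I[0] lists all operations on machine m_1
theta = len({op.machine for op in jobs[0]}) / 3         # fraction of the 3 machines used by j1
```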

    B. Quantum annealing formulation

In this work, we seek a suitable formulation of the JSP for a quantum annealing optimizer (such as the D-Wave Two). The optimizer is best described as an oracle that solves an Ising problem with a given probability [15]. This Ising problem is equivalent to a quadratic unconstrained binary optimization (QUBO) problem [10]. The binary polynomial associated with a QUBO problem can be depicted as a graph, with nodes representing variables and values attached to nodes and edges representing linear and quadratic terms, respectively. The QUBO solver can similarly be represented as a graph where nodes represent qubits and edges represent the allowed connectivity. The optimizer is expected to find the global minimum with some probability which itself depends on the problem and the device's parameters. The device is not an ideal oracle: its limitations, with regard to precision, connectivity, and number of variables, must be considered to achieve the best possible results. As is customary, we rely on the classical procedure known as embedding to adapt the connectivity of the solver to the problem at hand. This procedure is described in a number of quantum annealing papers [6, 11]. During this procedure, two or more variables can be forced to take on the same value by including additional constraints in the model. In the underlying Ising model, this is achieved by introducing a large ferromagnetic (negative) coupling JF between two spins. The embedding process modifies the QUBO problem accordingly, and one should not confuse the logical QUBO problem value, which depends on the QUBO problem and the state considered, with the Ising problem energy seen by the optimizer (which additionally depends on the extra constraints and the solver's parameters, such as JF).

We distinguish between the optimization version of the JSP, in which we seek a valid schedule with a minimal makespan, and the decision version, which is limited to validating whether or not a solution exists with a makespan smaller than or equal to a user-specified timespan T. We focus exclusively on the decision version and later describe how to implement a full optimization version based on a binary search. We note that the decision formulation where jobs are constrained to fixed time windows is sometimes referred to in the literature as the job-shop CSP formulation [16, 17], and our study will refer to those instances where the jobs share a common deadline T.

II. QUBO PROBLEM FORMULATION

While there are several ways the JSP can be formulated, such as the rank-based formulation [18] or the disjunctive formulation [19], our formulation is based on a straightforward time-indexed representation particularly amenable to quantum annealers (a comparative study of mappings for planning and scheduling problems can be found in [10]). We assign a set of binary variables for each operation, corresponding to the various possible discrete starting times the operation can have:

xi,t = { 1 : operation Oi starts at time t,
       { 0 : otherwise.                        (2)

Here t is bounded from above by the timespan T, which represents the maximum time we allow for the jobs to complete. The timespan itself is bounded from above by the total work of the problem, that is, the sum of the execution times of all operations.

    A. Constraints

We account for the various constraints by adding penalty terms to the QUBO problem. For example, an operation must start once and only once, leading to the constraint and associated penalty function

(∑t xi,t = 1 for each i)  →  ∑i (∑t xi,t − 1)².   (3)

There can only be one job running on each machine at any given point in time, which expressed as quadratic constraints yields

∑(i,t,k,t′)∈Rm xi,t xk,t′ = 0  for each m,   (4)


FIG. 1: a) Table representation of an example 3 x 3 instance whose execution times have been randomly selected to be either 1 or 2 time units. b) Pictorial view of the QUBO mapping of the above example for HT=6. Green, purple, and cyan edges refer respectively to h1, h2, and h3 quadratic coupling terms (Eqs. 7–9). Each circle represents a bit with its i, t index as in Eq. 2. c) The same QUBO problem as in (b) after the variable pruning procedure detailed in the section on QUBO formulation refinements. Isolated qubits are bits with fixed assignments that can be eliminated from the final QUBO problem. d) The same QUBO problem as in (b) for HT=7. Previously displayed edges in the above figure are omitted. Red edges/circles represent the variations with respect to HT=6. Yellow stars indicate the bits which are penalized with local fields for timespan discrimination.

where Rm = Am ∪ Bm and

Am = {(i, t, k, t′) : (i, k) ∈ Im × Im, i ≠ k, 0 ≤ t, t′ ≤ T, 0 < t′ − t < pi},
Bm = {(i, t, k, t′) : (i, k) ∈ Im × Im, i < k, t′ = t, pi > 0, pk > 0}.

The set Am is defined so that the constraint forbids operation Ok from starting at t′ if there is another operation Oi still running, which happens if Oi started at time t and t′ − t is less than pi. The set Bm is defined so that two jobs cannot start at the same time, unless at least one of them has an execution time equal to zero. Finally, the order of the operations within a job is enforced with

∑n ∑(kn−1 < i < kn) ∑(t + pi > t′) xi,t xi+1,t′ ,

which counts precedence violations between consecutive operations of the same job.
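The paper combines these penalty families into a single Hamiltonian with weights η, α, and β (Eqs. 5–9, not reproduced in this excerpt). The sketch below, building on the earlier helpers, assembles the start-once, machine-conflict, and precedence penalties with unit weights purely for illustration; the data layout and function names are our own and not the paper's implementation.

```python
from collections import defaultdict
from itertools import product

def build_qubo(q, p, k_last, T):
    """Time-indexed QUBO for the decision problem with timespan T (unit penalty weights).
    Variables are keyed by (i, t): operation O_i starts at time t."""
    Q = defaultdict(float)                       # maps (var, var) -> coefficient

    def add(u, v, w):
        Q[(u, v) if u <= v else (v, u)] += w

    n_ops, times = len(p), range(T + 1)

    # Eq. (3): each operation starts exactly once -> (sum_t x_{i,t} - 1)^2
    for i in range(n_ops):
        for t in times:
            add((i, t), (i, t), -1.0)            # x^2 - 2x = -x for binary x
            for t2 in times:
                if t2 > t:
                    add((i, t), (i, t2), 2.0)    # pairwise cross terms
        # the constant +1 per operation is dropped (it does not affect the argmin)

    # Eq. (4): no two operations may overlap on the same machine (sets A_m and B_m)
    by_machine = defaultdict(list)
    for i, m in enumerate(q):
        by_machine[m].append(i)
    for ops in by_machine.values():
        for i, k in product(ops, repeat=2):
            if i == k:
                continue
            for t, t2 in product(times, repeat=2):
                if 0 < t2 - t < p[i]:                                 # A_m: O_k starts while O_i runs
                    add((i, t), (k, t2), 1.0)
                elif i < k and t2 == t and p[i] > 0 and p[k] > 0:     # B_m: identical start times
                    add((i, t), (k, t2), 1.0)

    # Precedence within each job: O_{i+1} cannot start before O_i finishes
    job_ends = set(k_last)
    for i in range(n_ops - 1):
        if i in job_ends:
            continue                             # O_i and O_{i+1} belong to different jobs
        for t, t2 in product(times, repeat=2):
            if t2 < t + p[i]:
                add((i, t), (i + 1, t2), 1.0)
    return dict(Q)

Q = build_qubo(q, p, k_last, T=6)                # decision instance H_{T=6} for the toy example
```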


A first, simple pruning follows from the fact that each operation has an effective release time, corresponding to the sum of the execution times of the preceding operations in the same job. A similar upper bound, corresponding to the timespan minus all of the execution times of the subsequent operations of the same job, can be set. The bits corresponding to these invalid starting times can be eliminated from the QUBO problem altogether since any valid solution would require them to be strictly zero. This simplification eliminates an estimated number of variables equal to NM(M〈p〉 − 1), where 〈p〉 represents the average execution time of the operations. This result can be generalized to consider the previously defined ratio θ, such that the total number of variables required after this simple QUBO problem pre-processing is θNM [T − θM〈p〉 + 1].
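As a sketch (assuming the same instance representation as above), these release times and latest allowed start times can be computed in a single pass; bits outside the resulting windows are fixed to zero and dropped from the QUBO problem.

```python
def simple_prune(jobs, T):
    """For each operation, compute its release time r_i (sum of preceding durations in the job)
    and its latest allowed start d_i (T minus the work that must still run from it onwards).
    Bits x_{i,t} with t < r_i or t > d_i can be fixed to 0 and removed."""
    windows = []
    for job in jobs:
        durations = [op.duration for op in job]
        before = 0
        for j, op in enumerate(job):
            remaining = sum(durations[j:])          # this operation plus everything that follows
            windows.append((before, T - remaining)) # allowed start times [r_i, d_i]
            before += op.duration
    return windows

windows = simple_prune(jobs, T=6)
n_vars = sum(hi - lo + 1 for lo, hi in windows)     # variables kept after this simple pruning
```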

III. QUBO FORMULATION REFINEMENTS

Although the above formulation proves sufficient for running JSPs on the D-Wave machine, we explore a few potential refinements. The first pushes the limit of simple variable pruning by considering more advanced criteria for reducing the possible execution window of each task. The second refinement proposes a compromise between the decision version of the JSP and a full optimization version.

    A. Window shaving

In the time-indexed formalism, reducing the execution windows of operations (i.e., shaving) [20], or, in the disjunctive approach, adjusting the heads and tails of operations [21, 22], or, more generally, applying constraint propagation techniques (e.g., [23]), together constitute the basis for a number of classical approaches to solving the JSP. Shaving is sometimes used as a pre-processing step or as a way to obtain a lower bound on the makespan before applying other methods. The interest from our perspective is to showcase how such classical techniques remain relevant, without straying from our quantum annealing approach, when applied to the problem of pruning as many variables as possible. This enables larger problems to be considered and improves the success rate of embeddability in general (see Figure 3), without significantly affecting the order of magnitude of the overall time to solution in the asymptotic regime. Further immediate advantages of reducing the required number of qubits become apparent during the compilation of JSP instances for the D-Wave device due to the associated embedding overhead that depends directly on the number of variables. The shaving process is typically handled by a classical algorithm whose worst-case complexity remains polynomial. While this does not negatively impact the fundamental complexity of solving JSP instances, for pragmatic benchmarking the execution time needs to be taken into account and added to the quantum annealing runtime to properly report the time to solution of the whole algorithm.

Different elimination rules can be applied. We focus herein on the iterated Carlier and Pinson (ICP) procedure [21], reviewed in the appendix, with worst-case complexity given by O(N²M²T log(N)). Instead of looking at the one-job sub-problems and their constraints to eliminate variables, as we did for the simple pruning, we look at the one-machine sub-problems and their associated constraints to further prune variables. An example of the resulting QUBO problem is presented in Figure 1-c.

    B. Timespan discrimination

We explore a method of extracting more information regarding the actual optimal makespan of a problem within a single call to the solver by breaking the degeneracy of the ground states and spreading them over some finite energy scale, distinguishing the energy of valid schedules on the basis of their makespan. Taken to the extreme, this approach would amount to solving the full optimization problem. We find that the resulting QUBO problem is poorly suited to a solver with limited precision, so a balance must be struck between extra information and the precision requirement. A systematic study of how best to balance the amount of information obtained versus the extra cost will be the subject of future work.

We propose to add a number of linear terms, or local fields, to the QUBO problem to slightly penalize valid solutions with larger makespans. We do this by adding a cost to the last operation of each job, that is, the set {Ok1, . . . , OkN}. At the same time, we require that the new range of energy over which the feasible solutions are spread stays within the minimum logical QUBO problem's gap given by ∆E = min{η, α, β}. If the solver's precision can accommodate K distinguishable energy bins, then makespans within [T − K, T] can be immediately identified from their energy values. The procedure is illustrated in Figure 1-d and some implications are discussed in the appendix.
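The precise field values are derived in the appendix; the sketch below only illustrates the idea on top of the earlier helpers, spreading K illustrative energy bins over a caller-supplied gap `delta_E`, and is not the paper's exact prescription.

```python
def timespan_discrimination_fields(jobs, windows, T, K, delta_E):
    """Illustrative local fields: bits of each job's last operation whose implied completion
    time falls in (T - K, T] receive a small positive field growing with that completion time,
    so valid schedules with larger makespans end up at slightly higher energy."""
    fields = {}
    i = -1
    for job in jobs:
        i += len(job)                       # flattened index of this job's last operation
        p_last = job[-1].duration
        lo, hi = windows[i]
        for t in range(lo, hi + 1):
            finish = t + p_last
            if T - K < finish <= T:
                # K energy bins inside (0, delta_E]; later completions cost more
                fields[(i, t)] = delta_E * (finish - (T - K)) / K
    return fields
```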

IV. ENSEMBLE PRE-CHARACTERIZATION AND COMPILATION

We now turn to a few important elements of our computational strategy for solving JSP instances. We first show how a careful pre-characterization of classes of random JSP instances, representative of the problems to be run on the quantum optimizer, provides very useful information regarding the shape of the search space for T. We then describe how instances are compiled to run on the actual hardware.


    A. Makespan Estimation

In Figure 2, we show the distributions of the optimal makespans T for different ensembles of instances parametrized by their size N = M, by the possible values of task durations Pp = {pmin, . . . , pmax}, and by the ratio θ ≤ 1 of the number of machines used by each job. Instances are generated randomly by selecting θM distinct machines for each job and assigning an execution time to each operation randomly. For each set of parameters, we can compute solutions with a classical exhaustive solver in order to identify the median of the distribution 〈T〉(N, Pp, θ) as well as the other quantiles. These could also be inferred from previously solved instances with the proposed annealing solver. The resulting information can be used to guide the binary search required to solve the optimization problem. Figure 2 indicates that a normal distribution is an adequate approximation, so we need only to estimate its average 〈T〉 and variance σ². Interestingly, from the characterization of the families of instances up to N = 10 we find that, at least in the region explored, the average minimum makespan 〈T〉 is proportional to the average execution time of a job 〈p〉θN, where 〈p〉 is the mean of Pp. This linear ansatz allows for the extrapolation of approximate resource requirements for classes of problems which have not yet been pre-characterized, and it constitutes an educated guess for classes of problems which cannot be pre-characterized due to their difficulty or size. The accuracy of these functional forms was verified by computing the relative error of the prediction versus the fit of the makespan distribution of each parametrized family up to N = M = 9 and pmax = 20, using 200 instances to compute the makespan histogram. The predictions for 〈T〉 are consistently at least 95% accurate, while the one for σ has at worst a 30% error margin, a very approximate but sufficient model for the current purpose of guiding the binary search.
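A sketch of this estimate, using the fitted functional form and constants reported in the caption of Figure 2 (and assuming those constants apply to the ensemble at hand), which can then seed the binary search described later:

```python
def makespan_prior(N, p_min, p_max):
    """Gaussian prior on the optimal makespan from the empirical fit reported in Fig. 2:
    <T> = A_T*N*p_min + B_T*N*p_max and sigma = sigma_0 + C_s*<T> + A_s*p_min + B_s*p_max."""
    A_T, B_T = 0.67, 0.82
    sigma_0, A_s, B_s, C_s = 0.7, -0.03, 0.43, 0.003
    mean = A_T * N * p_min + B_T * N * p_max
    sigma = sigma_0 + C_s * mean + A_s * p_min + B_s * p_max
    return mean, sigma

mean, sigma = makespan_prior(N=6, p_min=0, p_max=2)   # e.g. the ensemble N = M = 6, Pp = [0, 2]
```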

    B. Compilation

The graph-minor embedding technique (abbreviated simply "embedding") represents the de facto method of recasting the Ising problems to a form compatible with the layout of the annealer's architecture [24, 25], which for the D-Wave Two is a Chimera graph [1]. Formally, we seek an isomorphism between the problem's QUBO graph and a graph minor of the solver. This procedure has become a standard in solving applied problems using quantum annealing [6, 11] and can be thought of as the analogue of compilation in a digital computer programming framework, during which variables are assigned to hardware registers and memory locations. This process is covered in more detail in the appendix. An example of embedding for a 5 x 5 JSP instance with θ = 1 and T = 7 is shown in Figure 3-a, where the 72 logical variables of the QUBO problem are embedded using 257 qubits of the


FIG. 2: a) Normalized histograms of optimal makespans T for parametrized families of JSP instances with N = M, Pp on the y-axis, θ = 1 (yellow), and θ = 0.5 (purple). The distributions are histograms of occurrences for 1000 random instances, fitted with a Gaussian function of mean 〈T〉. We note that the width of the distributions increases as the range of the execution times Pp increases, for fixed 〈p〉. The mean and the variance are well fitted respectively by 〈T〉 = AT N pmin + BT N pmax and σ = σ0 + Cσ〈T〉 + Aσ pmin + Bσ pmax, with AT = 0.67, BT = 0.82, σ0 = 0.7, Aσ = −0.03, Bσ = 0.43, and Cσ = 0.003.

Chimera graph. Finding the optimal tiling that uses the fewest qubits is NP-hard [26], and the standard approach is to employ heuristic algorithms [27]. In general, for the embedding of time-indexed mixed-integer programming QUBO problems of size N into a graph of degree k, one should expect a quadratic overhead in the number of binary variables on the order of aN², with a ≤ (k − 2)⁻¹ depending on the embedding algorithm and the hardware connectivity [11]. This quadratic scaling is apparent in Figure 3-b, where we report on the compilation attempts using the algorithm in [27]. Results are presented for the D-Wave chip installed at NASA Ames at the time of this study, for a larger chip with the same size of Chimera block and connectivity pattern (like the latest chip currently being manufactured by D-Wave Systems), and for a speculative yet-larger chip where the Chimera block is twice as large. We deem a JSP instance embeddable when the respective HT=T is embeddable, so the decrease in probability of embedding with increasing system size is closely related to the shift and spreading of the optimal makespan distributions for ensembles of increasing size (see Figure 2). What we observe is that, with the available algorithms, the current architecture admits embedded JSP instances whose total execution time NMθ〈p〉 is around 20 time units, while near-future (we estimate 2 years) D-Wave chip architectures could potentially double that. As noted in similar studies (e.g., [6]), graph connectivity has a much more dramatic impact on embeddability than qubit count.
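A sketch of this compilation step, assuming the open-source heuristic embedder of Ref. [27] is available through the `minorminer` package and the Chimera target graph through `dwave_networkx`; the graph sizes and function names are illustrative.

```python
import networkx as nx
import dwave_networkx as dnx
import minorminer

def embed_qubo(Q, chimera_m=8, chimera_n=8, chimera_t=4):
    """Try to embed the logical QUBO graph into an m x n Chimera graph with K_{t,t} cells
    (8 x 8 x (4,4) corresponds to the D-Wave Two generation discussed in the text)."""
    logical = nx.Graph()
    logical.add_edges_from((u, v) for (u, v) in Q if u != v)
    hardware = dnx.chimera_graph(chimera_m, chimera_n, chimera_t)
    emb = minorminer.find_embedding(logical.edges(), hardware.edges())
    if not emb:                        # empty dict: the heuristic failed within its time budget
        return None
    qubits_used = sum(len(chain) for chain in emb.values())
    return emb, qubits_used
```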


FIG. 3: a) Example of an embedded JSP instance on NASA's D-Wave Two chip. Each chain of qubits is colored to represent a logical binary variable determined by the embedding. For clarity, active connections between the qubits are not shown. b) Embedding probability as a function of N = M for θ = 1 (similar results are observed for θ = 0.5). Solid lines refer to Pp = [1, 1] and dashed lines refer to Pp = [0, 2]. 1000 random instances have been generated for each point, and a cutoff of 2 minutes has been set for the heuristic algorithm to find a valid topological embedding. Results for different sizes of Chimera are shown. c) Optimal parameter-setting analysis for the ensembles of JSP instances we studied. Each point corresponds to the number of qubits and the optimal JF (see main text) of a random instance, and each color represents a parametrized ensemble (green: 3 x 3, purple: 4 x 4, yellow: 5 x 5, blue: 6 x 6; darker colors represent ensembles with Pp = [1, 1] as opposed to lighter colors which indicate Pp = [0, 2]). Distributions on the right of the scatter plots represent Gaussian fits of the histogram of the optimal JF for each ensemble. Runtime results are averaged over an ungauged run and 4 additional runs with random gauges [28].

Once the topological aspect of embedding has been solved, we set the ferromagnetic interactions needed to adapt the connectivity of the solver to the problem at hand. For the purpose of this work, this should be regarded as a technicality necessary to tune the performance of the experimental analog device, and we include the results for completeness. Introductory details about the procedure can be found in [6, 11]. In Figure 3-c we show a characterization of the ensemble of JSP instances (parametrized by N, M, θ, and Pp, as described at the beginning of this section). We present the best ferromagnetic couplings found by runs on the D-Wave machine under the simplification of a uniform ferromagnetic coupling, by solving the embedded problems with values of JF from 0.4 to 1.8 in relative energy units of the largest coupling of the original Ising problem. The run parameters used to determine the best JF are the same as we report in the following sections, and the problem sets tested correspond to Hamiltonians whose timespan is equal to the sought makespan HT=T. This parameter-setting approach is similar to the one followed in [6] for operational planning problems, where the instance ensembles were classified by problem size before compilation. What emerges from this preliminary analysis is that each parametrized ensemble can be associated with a distribution of optimal JF that can be quite wide, especially for the ensembles with pmin = 0 and large pmax. This spread might discourage the use of the mean value of such a distribution as a predictor of the best JF to use for the embedding of new untested instances. However, the results from this predictor appear to be better than the more intuitive prediction obtained by correlating the number of qubits after compilation with the optimal JF. This means that for the D-Wave machine to achieve optimal performance on structured problems, it seems to be beneficial to use the information contained in the structure of the logical problem to determine the best parameters. We note that this "offline" parameter-setting could be used in combination with "online" performance estimation methods such as the ones described in [28] in order to reach the best possible instance-specific JF with a series of quick experimental runs. The application of these techniques, together with the testing of alternative offline predictors, will be the subject of future work.

V. RESULTS OF TEST RUNS AND DISCUSSION

A complete quantum annealing JSP solver designed to solve an instance to optimality using our proposed formulation will require the independent solution of several embedded instances {HT}, each corresponding to a different timespan T. Assuming that the embedding time, the machine setup time, and the latency between subsequent operations can all be neglected, due to their being non-fundamental, the running time T of the approach for a specific JSP instance reduces to the expected total annealing time necessary to find the optimal solution of each embedded instance with a specified minimum target probability ≃ 1. The probability of ending the annealing cycle in a desired ground state depends, in an essentially unknown way, on the embedded Ising Hamiltonian spectrum, the relaxation properties of the environment, the effect of noise, and the annealing profile. Understanding through an ab initio approach what is the best computational strategy appears to be a formidable undertaking that would require theoretical breakthroughs in the understanding of open-system quantum annealing [29, 30], as well as a tailored algorithmic analysis that could take advantage of the problem structure that the annealer needs to solve. For the time being, and for the purposes of this work, it seems much more practical to limit these early investigations to the most relevant instances, and to lay out empirical procedures that work under some general assumptions. More specifically, we focus on solving the CSP version of the JSP, not the full optimization problem, and we therefore only benchmark the Hamiltonians with T = T with the D-Wave machine. We note however that a full optimization solver can be realized by leveraging data analysis of past results on parametrized ensembles and by implementing an adaptive binary search. Full details can be found in a longer version of this work [31].

On the quantum annealer installed at NASA Ames (it has 509 working qubits; details are presented in [32]), we ran hundreds of instances, sampling the ensembles N = M ∈ {3, 4, 5, 6}, θ ∈ {0.5, 1}, and Pp ∈ {[1, 1], [0, 2]}. For each instance, we report results, such as runtimes, at the best JF among those tested, assuming the application of an optimized parameter-setting procedure along the lines of that described in the previous section. Figure 4-a displays the total annealing repetitions required to achieve a 99% probability of success on the ground state of HT, with each repetition lasting tA = 20 µs, as a function of the number of qubits in the embedded (and pruned) Hamiltonian. We observe an exponential increase in complexity with increasing Hamiltonian size, for both classes of problems studied. This likely means that while the problems tested are small, the analog optimization procedure intrinsic to the D-Wave device's operation is already subject to the fundamental complexity bottlenecks of the JSP. It is, however, premature to draw conclusions about performance scaling of the technology given the current constraints on calibration procedures, annealing time, etc. Many of these problems are expected to be either overcome or nearly so with the next generation of D-Wave chip, at which point more extensive experimentation will be warranted.

In Figure 4-b, we compare the performance of the D-Wave device to two exhaustive classical algorithms in order to gain insight into how current quantum annealing technology compares with paradigmatic classical optimization methods. Leaving the performance of approximate solutions for future work, we chose not to explore the plethora of possible heuristic methods, as we operate the D-Wave machine seeking the global optimum.

The first algorithm, B, detailed in [35], exploits the disjunctive graph representation and a branch-and-bound strategy that very effectively combines a branching


FIG. 4: a) Number of repetitions required to solve HT with the D-Wave Two with a 99% probability of success (see the appendix). The blue points indicate instances with θ = 1 and yellow points correspond to θ = 0.5 (they are the same instances and runtimes used for Figure 3-c). The number of qubits on the x-axis represents the qubits used after embedding. b) Correlation plot between classical solvers and the D-Wave optimizer. Gray and violet points represent runtimes compared with algorithm B, and cyan and red are compared to the MS algorithm, respectively, with θ = 1 and θ = 0.5. All results presented correspond to the best out of 5 gauges selected randomly for every instance. In case the machine returns embedding components whose values are discordant, we apply a majority voting rule to recover a solution within the logical subspace [6, 11, 28, 33, 34]. We observe a deviation of about an order of magnitude on the annealing time if we average over 5 gauges instead of picking the best one, indicating that there is considerable room for improvement if we were to apply more-advanced calibration techniques [32].

scheme based on selecting the direction of a single disjunctive edge (according to some single-machine constraints), and a technique introduced in [36] for fixing additional disjunctions (based on a preemptive relaxation). It has publicly available code and is considered a well-performing complete solver for the small instances currently accessible to us, while remaining competitive for larger ones even if other classical approaches become more favorable [37]. B has been used in [38] to discuss the possibility of a phase transition in the JSP, demonstrating that the random instances with N = M are particularly hard families of problems, not unlike what is observed for the quantum annealing implementation of planning problems based on graph vertex coloring [9].

The second algorithm, MS, introduced in [20], proposes a time-based branching scheme where a decision is made at each node to either schedule or delay one of the available operations at the current time. The authors then rely on a series of shaving procedures such as those proposed by [21] to determine the new bound and whether the choice leads to valid schedules. This algorithm is a natural comparison with the present quantum annealing approach, as it solves the decision version of the JSP in a very similar fashion to the time-indexed formulation we have implemented on the D-Wave machine, and it makes

  • 8

use of the same shaving technique that we adapted as a pre-processing step for variable pruning. However, we should mention that the variable pruning that we implemented to simplify HT is employed at each node of the classical branch-and-bound algorithm, so the overall computational time of MS is usually much larger than that of our one-pass pre-processing step, and in general its runtime does not scale polynomially with the problem size.

What is apparent from the correlation plot in Figure 4-b is that the D-Wave machine is easily outperformed by a classical algorithm run on a modern single-core processor, and that the problem sizes tested in this study are still too small for the asymptotic behavior of the classical algorithms to be clearly demonstrated and measured. The comparison between the D-Wave machine's solution time for HT and the full optimization provided by B pits two very different algorithms against each other, and shows that B solves all of the full optimization problems that have been tested within milliseconds, whereas D-Wave's machine can sometimes take tenths of a second (before applying the multiplier factor ≃ 2, due to the binary search; see the appendix). When larger chips become available, however, it will be interesting to compare B to a quantum annealing solver for sizes considered B-intractable due to increasing memory and time requirements.

The comparison with the MS method has a promising signature even now, with roughly half of the instances being solved by D-Wave's hardware faster than by the MS algorithm (with the caveat that our straightforward implementation is not fully optimized). Interestingly, the different parametrized ensembles of problems have distinctively different computational complexity, characterized by well-recognizable average computational time to solution for MS (i.e., the points are "stacked around horizontal lines" in Figure 4-b), whereas the D-Wave machine's complexity seems to be sensitive mostly to the total qubit count (see Figure 4-a) irrespective of the problem class. We emphasize again that conclusions on speedup and asymptotic advantage still cannot be confirmed until improved hardware with more qubits and less noise becomes available for empirical testing.

VI. CONCLUSIONS

Although it is probable that the quantum annealing-based JSP solver proposed herein will not prove competitive until the arrival of an annealer a few generations away, the implementation of a provably tough application from top to bottom was missing in the literature, and our work has led to noteworthy outcomes we expect will pave the way for more advanced applications of quantum annealing. Whereas part of the attraction of quantum annealing is the possibility of applying the method irrespective of the structure of the QUBO problem, we have shown how to design a quantum annealing

I. Problem / instance parametrization
II. Ensemble pre-characterization (software)
III. Choice of mapping
IV. Pre-processing
V. Ensemble pre-characterization (hardware)
VI. Embedding strategy
VII. Running strategy
VIII. Decoding and analysis

FIG. 5: I–II) Appropriate choice of benchmarking and classical simulations is discussed in Section IV. III–IV) Mapping to QUBO problems is discussed in Sections II and III. V–VI) Pre-characterization for parameter setting is described in Section VI. VII) Structured run strategies adapted to specific problems have not, to our knowledge, been discussed before. We discuss a prescription in the appendix. VIII) The only decoding required in our work is majority voting within embedding components to recover error-free logical solutions. The time-indexed formulation then provides QUBO problem solutions that can straightforwardly be represented as Gantt charts of the schedules.

solver, mindful of many of the peculiarities of the annealing hardware and the problem at hand, for improved performance. Figure 5 shows a schematic view of the streamlined solving process describing a full JSP optimization solver. The pictured scheme is not intended to be complete; for example, the solving framework can benefit from other concepts such as performance tuning techniques [28] and error-correction repetition lattices [39]. The use of the decision version of the problem can be combined with a properly designed search strategy (the simplest being a binary search) in order to seek the minimum value of the common deadline of feasible schedules. The proposed timespan discrimination further provides an adjustable compromise between the full optimization and decision formulations of the problems, allowing for instant benefits from future improvements in precision without the need for a new formulation or additional binary variables to implement the makespan minimization as a term in the objective function. As will be explored further in future work, we found that instance pre-characterization performed to fine-tune the solver parameters can also be used to improve the search strategy, and that it constitutes a tool whose use we expect to become common practice in problems amenable to CSP formulations such as the ones proposed for the JSP. Additionally, we have shown that there is great potential


in adapting classical algorithms with favorable polynomial scaling as pre-processing techniques to either prune variables or reduce the search space. Hybrid approaches and metaheuristics are already fruitful areas of research, and ones that are likely to see promising developments with the advent of these new quantum heuristic algorithms.

    A. Acknowledgements

The authors would like to thank J. Frank, M. Do, E.G. Rieffel, B. O'Gorman, M. Bucyk, P. Haghnegahdar, and other researchers at QuAIL and 1QBit for useful input and discussions. This research was supported by 1QBit, Mitacs, NASA (Sponsor Award Number NNX12AK33A), and by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via IAA 145483, and by the AFRL Information Directorate under grant F4HBKC4162G001.

    Appendix A: JSP and QUBO formulation

In this appendix we expand on the penalty form used for the constraints and the alternative reward-based formulation, as well as the timespan discrimination scheme.

    1. Penalties versus rewards formulation

The encoding of constraints as terms in a QUBO problem can either reward the respecting of these constraints or penalize their violation. Although the distinction may at first seem artificial, the actual QUBO problem generated differs and can lead to different performance on an imperfect annealer. We present one such alternative formulation where the precedence constraint (7) is instead encoded as a reward for correct ordering by replacing +ηh1(x̄) with −η′h′1(x̄), where

h′1(x̄) = ∑n ∑(kn−1 < i < kn) ∑(t′ ≥ t + pi) xi,t xi+1,t′ ,

which rewards pairs of consecutive operations of the same job whose starting times are compatible with the precedence constraint.


Because the makespan depends on the completion time of the last operation of each job, the precedence constraint guarantees that the makespan of a valid solution will be equal to the completion time of one of those operations. We can then select the local field appropriately as a function of the time index t to penalize a fixed number K of the larger makespans, ranging from T − K + 1 to T. Within a sector assigned to the time step T, we need to further divide ∆ET by the maximum number of operations that can complete at T to obtain the largest value we can use as the local field hT, i.e., the number of distinct machines used by at least one operation in the set of operations {Ok1, . . . , OkN}, denoted by Mfinal. If K is larger than 1, we also need to ensure that contributions from various sectors can be differentiated. The objective is to assign a distinct T-dependent energy range to all valid schedules with makespans within [T − K, T]. More precisely, we relate the local fields for various sectors with the recursive relation

hT−1 = hT / Mfinal + ε,   (A4)

where ε is the minimum logical energy resolvable by the annealer. Considering that ε is also the minimum local field we can use for hT−K+1, and that the maximum total penalty we can assign through this time-discrimination procedure is ∆E − ε, it is easy to see that the energy resolution should scale as ∆E/(Mfinal^K). An example of the use of local fields for timespan discrimination is shown in Figure 1-d of the main text for the case K = 1.

    Appendix B: Computational strategy

This appendix elaborates on the compilation methods and the quantum annealing implementation of a full optimization solver based on the decision version and a binary search, as outlined in the main text.

    1. Compilation

The process of compiling, or embedding, an instance for a specific target architecture is a crucial step given the locality of the programmable interactions on current quantum annealer architectures. During the graph-minor topological embedding, each vertex of the problem graph is mapped to a subset of connected vertices, or subgraph, of the hardware graph. These assignments must be such that the edges in the problem graph have at least one corresponding edge between the associated subgraphs in the hardware graph. Formally, the classical Hamiltonian Eq. (6) is mapped to a quantum annealing Ising Hamiltonian on the hardware graph using the set of equations that follows. The spin operators s⃗σi are defined by setting s = 1 and using the Pauli matrices to write ⃗σi = (σxi, σyi, σzi). The resulting spin variables σzi = ±1, our qubits, are easily converted to the usual binary variables xi = 0, 1 with σzi = 2xi − 1. The Ising Hamiltonian is given by

H = A(t) [HQ + HE] + B(t) HD,   (B1)

HQ = ∑ij Jij σz_αi σz_βj |(αi,βj)∈E(i,j) + ∑i ∑k∈V(i) (hi / NV(i)) σz_k,   (B2)

HE = −∑i ∑(k,k′)∈E(i,i) JFi,k,k′ σz_k σz_k′,   (B3)

HD = ∑i ∑k∈V(i) σx_k,   (B4)

where for each logical variable index i we have a corresponding ensemble of qubits given by the set of vertices V(i) in the hardware graph, with |V(i)| = NV(i). Edges between logical variables are denoted E(i, j) and edges within the subgraph of V(i) are denoted E(i, i). The couplings Jij and local fields hi represent the logical terms obtained after applying the linear QUBO–Ising transformation to Eq. (6). JFi,k,k′ are embedding parameters for vertex V(i) and (k, k′) ∈ E(i, i) (see discussion below on the ferromagnetic coupling). The equation above assumes that a local field hi is distributed uniformly between the vertices V(i) and the coupling Ji,j is attributed to a single randomly selected edge (αi, βj) among the available couplers E(i, j), but other distributions can be chosen. In the actual hardware implementation we rescale the Hamiltonian by dividing by JF, which is the value assigned to all JFi,k,k′, as explained below. This is due to the limited range of Jij and hi allowed by the machine [11].

Once a valid embedding is chosen, the ferromagnetic interactions JFi,k,k′ in Eq. (B3) need to be set appropriately. While the purpose of these couplings is to penalize states for which 〈σzk〉 ≠ 〈σzk′〉 for k, k′ ∈ V(i), setting them to a large value negatively affects the performance of the annealer due to the finite energy resolution of the machine (given that all parameters must later be rescaled to the actual limited parameter range of the solver) and the slowing down of the dynamics of the quantum system associated with the introduction of small energy gaps. There is guidance from research in physics [11, 40] and mathematics [41] on which values could represent the optimal JFi,k,k′ settings, but for application problems it is customary to employ empirical prescriptions based on pre-characterization [6] or estimation techniques of performance [28].
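A minimal sketch of the mapping in Eqs. (B2)–(B3), assuming a logical Ising problem (h, J), an embedding map from logical variables to qubit chains, and a uniform ferromagnetic value JF; the chain is treated as a simple path and the choice of inter-chain coupler is simplified, whereas a real implementation would restrict both to edges actually present in the hardware graph.

```python
import random

def embed_ising(h, J, embedding, JF):
    """Build the embedded Ising problem: each logical field h_i is spread uniformly over its
    chain V(i), each logical coupling J_ij is placed on one physical coupler between the two
    chains, and intra-chain couplers receive the ferromagnetic value -JF (the H_E term).
    Everything is finally rescaled by JF to respect the machine's limited parameter range."""
    h_phys, J_phys = {}, {}
    for i, chain in embedding.items():
        for k in chain:
            h_phys[k] = h.get(i, 0.0) / len(chain)          # h_i / N_V(i)
        for k, k2 in zip(chain, chain[1:]):                  # assumes consecutive qubits are coupled
            J_phys[(k, k2)] = -JF
    for (i, j), Jij in J.items():
        k, k2 = random.choice(embedding[i]), random.choice(embedding[j])
        J_phys[(k, k2)] = J_phys.get((k, k2), 0.0) + Jij     # one coupler carries the logical J_ij
    scale = 1.0 / JF
    return ({k: v * scale for k, v in h_phys.items()},
            {e: v * scale for e, v in J_phys.items()})
```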

Despite embedding being a time-consuming classical computational procedure, it is usually not considered part of the computation and its runtime is not measured in determining algorithmic complexity. This is because we can assume that for parametrized families of problems one could create and make available a portfolio of embeddings that are compatible with all instances belonging to a given family. The existence of such a library would reduce the computational cost to a simple query in a lookup table, but this could come at the price of the available embedding not being fully optimized for the particular problem instance.

    2. Quantum annealing optimization solver

We now detail our approach to solving individual JSP instances. We shall assume the instance at hand can be identified as belonging to a pre-characterized family of instances for minimal computational cost. This can involve identifying N, M, and θ, as well as the approximate distribution of execution times for the operations. The pre-characterization is assumed to include a statistical distribution of optimal makespans as well as the appropriate solver parameters (JF, optimal annealing time, etc.). Using this information, we need to build an ensemble of queries Q = {q} to be submitted to the D-Wave quantum annealer to solve a problem H. Each element of Q is a triple (tA, R, T) indicating that the query considers R identical annealings of the embedded Hamiltonian HT for a single annealing time tA. To determine the elements in Q we first make some assumptions, namely, i) sufficient statistics: for each query, R is sufficiently large to sample appropriately the ensembles defined in Eqs. (B7)–(B9); ii) generalized adiabaticity: tA is optimized (over the window of available annealing times) for the best annealing performance in finding a ground state of HT (i.e., the annealing procedure is such that the total expected annealing time tA R⋆ required to evolve to a ground state is as small as possible compared to the time required to evolve to an excited state, with the same probability). Both of these conditions can be achieved in principle by measuring the appropriate optimal parameters R⋆(q) and t⋆A(q) through extensive test runs over random ensembles of instances. However, we note that verifying these assumptions experimentally is currently beyond the operational limits of the D-Wave Two device, since the optimal tA for generalized adiabaticity is expected to be smaller than the minimum programmable value [3]. Furthermore, we deemed the considerable machine time required for such a large-scale study too onerous in the context of this initial foray. Fortunately, the first limitation is expected to be lifted with the next generation of chip, at which point nothing would prevent the proper determination of a family-specific choice of R⋆ and t⋆A. Given a specified annealing time, the number of anneals is determined by specifying r0, the target probability of success for queries or confidence level, and measuring rq, the rate of occurrence of the ground state per repetition for the following query:

R⋆ = log[1 − r0] / log[1 − rq].   (B5)

The rate rq depends on the family, T, and the other parameters. The minimum observed during pre-characterization should be used to guarantee the ground state is found with at least the specified r0. Formally,

the estimated time to solution of a problem is then given by

T = ∑q∈Q tA (log[1 − r0] / log[1 − rq]).   (B6)

The total probability of success of solving the problem in time T will consequently be ∏q rq. For the results presented in this paper, we used R⋆ = 500 000 and t⋆A = min(tA) = 20 µs.
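A short sketch of Eqs. (B5)–(B6); the numbers used in the example call are illustrative, not measured values from the study.

```python
import math

def repetitions_needed(r_q, r_0=0.99):
    """Eq. (B5): number of anneals R* so the ground state is seen with probability >= r_0,
    given a per-anneal success rate r_q."""
    return math.ceil(math.log(1.0 - r_0) / math.log(1.0 - r_q))

def time_to_solution(queries, r_0=0.99):
    """Eq. (B6): expected total annealing time over the queries in Q.
    Each query is a pair (t_A in seconds, per-anneal success rate r_q)."""
    return sum(t_A * math.log(1.0 - r_0) / math.log(1.0 - r_q) for t_A, r_q in queries)

print(repetitions_needed(0.01))            # 459 anneals for a 1% per-anneal success rate
print(time_to_solution([(20e-6, 0.01)]))   # ~9.2 ms of annealing time at t_A = 20 microseconds
```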

We can define three different sets of qubit configurations that can be returned when the annealer is queried with q. E is the set of configurations whose energy is larger than ∆E as defined in Section III of the paper. These configurations represent invalid schedules. V is the set of solutions with zero energy, i.e., the solutions whose makespan is small enough (at most T − K) that they have not been discriminated by the procedure described in the subsection on timespan discrimination. Finally, S is the set of valid solutions that can be discriminated (makespan in (T − K, T]). Depending on the timespan T of the problem Hamiltonian HT and the optimal makespan of the instance, the quantum annealer samples the following configuration space (reporting R samples per query):

optimal makespan > T  →  V, S = ∅  →  E0 > ∆E,   (B7)
optimal makespan ∈ (T − K, T]  →  V = ∅  →  E0 ∈ (0, ∆E],   (B8)
optimal makespan ≤ T − K  →  E, V, S ≠ ∅  →  E0 = 0.   (B9)

Condition (B8) is the desired case, where the ground state of HT with energy E0 corresponds to a valid schedule with the optimal makespan we seek. The ground states corresponding to conditions (B7) and (B9) are instead, respectively, invalid schedules and valid schedules whose makespan could correspond to a global minimum (to be determined by subsequent calls). The above-described assumption ii) is essential to justify aborting the search when case (B8) is encountered. If R and tA are appropriately chosen, the ground state will be preferentially found instead of all other solutions, such that one can stop annealing reasonably soon (i.e., after a number of reads on the order of R⋆) in the absence of the appearance of a zero-energy solution. We can then declare this minimum-energy configuration, found within (0, ∆E], to be the ground state and the associated makespan and schedule to be the optimal solution of the optimization problem. On the other hand, we note that if K = 0, a minimum of two calls are required to solve the problem to optimality: one to show that no valid solution is found when the timespan is one unit below the optimal makespan, and one to show that a zero-energy configuration is found when the timespan equals the optimal makespan. While for cases (B8) and (B9) the appearance of an energy less than or equal to ∆E heuristically determines the trigger that stops the annealing of HT, for case (B7) we need to have a prescription, based on pre-characterization, on how long to anneal in order to be confident that the timespan is indeed below the optimal makespan. While optimizing these times is a research program on its own

  • 12

    that requires extensive testing, we expect that the char-acteristic time for achieving condition (B8) when T = Twill be on the same order of magnitude for this unknownruntime.
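As an illustration of the three outcomes (B7)–(B9), the sketch below classifies the energies returned by a single query and maps the lowest one onto the corresponding case; the energy tolerance and the interface are assumptions made for the example, not part of the actual solver.

def classify_reads(energies, delta_E, tol=1e-9):
    """Sort the R returned configurations of a query into the three sets of the
    text: E (invalid, energy > delta_E), V (zero energy), S (0 < energy <= delta_E)."""
    E = [e for e in energies if e > delta_E + tol]
    V = [e for e in energies if abs(e) <= tol]
    S = [e for e in energies if tol < e <= delta_E + tol]
    return E, V, S

def query_outcome(energies, delta_E):
    """Map the lowest observed energy E0 onto conditions (B7)-(B9)."""
    E0 = min(energies)
    if E0 > delta_E:
        return "B7"   # no valid schedule for this timespan: raise the lower bound
    if E0 > 0:
        return "B8"   # optimal makespan found in the discriminated window: stop
    return "B9"       # valid schedule found: lower the upper bound and continue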

    3. Search strategy

The final important component of the computational strategy consists in determining the sequence of timespans of the calls (i.e., the ensemble Q). Here we propose to select the queries based on an optimized binary search that makes informed dichotomic decisions based on the pre-characterization of the distribution of optimal makespans of the parametrized ensembles, as described in the previous sections. More specifically, the search is designed based on the assumption that the JSP instance at hand belongs to a family whose makespan distribution has a normal form with average makespan ⟨T⟩ and variance σ². This fitted distribution is the same P_p described in Figure 2-a of the main text, whose tails have been cut off at locations corresponding to an instance-dependent upper bound T_max and strict lower bound T_min (see the following section on bounds).

Once the initial T_min and T_max are set, the binary search proceeds as follows. To ensure a logarithmic scaling for the search, we need to take into account the normal distribution of makespans by attempting to bisect the range (T_min, T_max] such that the probability of finding the optimal makespan on either side is roughly equal. In other words, T should be selected by solving the following equation and rounding to the nearest integer:

\mathrm{erf}\!\left( \frac{T_{\max} + \frac{1}{2} - \langle T \rangle}{\sigma \sqrt{2}} \right) + \mathrm{erf}\!\left( \frac{T_{\min} + \frac{1}{2} - \langle T \rangle}{\sigma \sqrt{2}} \right) = \mathrm{erf}\!\left( \frac{T + \frac{1}{2} - \langle T \rangle}{\sigma \sqrt{2}} \right) + \mathrm{erf}\!\left( \frac{T - \max(1, K) + \frac{1}{2} - \langle T \rangle}{\sigma \sqrt{2}} \right),    (B10)

where erf(x) is the error function. For our current purpose, an inexpensive approximation of the error function is sufficient. In most cases this condition means initializing the search at T = ⟨T⟩. We produce a query q_0 for the annealing of H_T. If no schedule is found (condition (B7)), we simply let T_min = T. If condition (B9) is verified, on the other hand, we update T_max to be the makespan of the valid solution found (which is equal to T − max(1, K) + 1 in the worst case) for the determination of the next query q_1. The third condition, (B8), only reachable if K > 0, indicates both that the search can stop and that the problem has been solved to optimality. The search proceeds in this manner, updating the bounds and bisecting the new range at each step, and stops either with condition (B8) or when T = T_max = T_min + 1. Figure 6-a shows an example of such a binary search in practice. The reason for using this guided search is that the average number of calls needed to find the optimal makespan is dramatically reduced with respect to a linear search on the range (T_min, T_max]. For a small variance this optimized search is equivalent to a linear search that starts near T = ⟨T⟩. A more spread-out distribution, on the other hand, will see a clear advantage due to the logarithmic, instead of linear, scaling of the search. In Figure 6-b, we compute this average number of calls as a function of N, θ, and K for N = M instances generated such that an operation's average execution time also scales with N. This last condition ensures that the variance of the makespan grows linearly with N as well, so that the logarithmic behavior becomes evident for larger instances. For this calculation we use the worst case when updating T_max after condition (B9) is met. We find that for the instances experimentally testable with the D-Wave Two device (see Figure 3-b of the main text), the expected number of calls to solve the problem is less than three (in the absence of pre-characterization it would be twice that), while for larger instances the size of Q scales logarithmically, as expected.
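The following sketch illustrates the guided binary search under the stated Gaussian assumption: the next timespan is chosen by balancing Eq. (B10) numerically over the integers in (T_min, T_max], and the bounds are updated according to the outcome of each query. The query callback, returning one of the labels "B7"/"B8"/"B9" together with the best makespan observed, is a hypothetical interface introduced only for this example.

import math

def bisect_timespan(T_min, T_max, mean, sigma, K):
    """Pick the next timespan T in (T_min, T_max] by balancing the probability
    mass of the fitted Gaussian makespan distribution on both sides (Eq. B10)."""
    def phi(x):  # erf((x + 1/2 - mean) / (sigma * sqrt(2)))
        return math.erf((x + 0.5 - mean) / (sigma * math.sqrt(2.0)))
    target = phi(T_max) + phi(T_min)
    candidates = range(T_min + 1, T_max + 1)
    return min(candidates,
               key=lambda T: abs(phi(T) + phi(T - max(1, K)) - target))

def guided_search(query, T_min, T_max, mean, sigma, K):
    """Guided binary search for the optimal makespan. `query(T)` is assumed to
    return one of "B7", "B8", "B9" and the best makespan observed for that call."""
    while T_max > T_min + 1:
        T = bisect_timespan(T_min, T_max, mean, sigma, K)
        outcome, best_makespan = query(T)
        if outcome == "B7":          # no schedule of length <= T exists
            T_min = T
        elif outcome == "B8":        # optimum lies in (T - K, T]: solved
            return best_makespan
        else:                        # "B9": a valid schedule was found
            T_max = best_makespan
    return T_max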

FIG. 6: a) View of a guided binary search required to identify the minimum makespan over a distribution. The fitted example distribution corresponds to N = M = 8, fitted to a Gaussian as described in the main text. We assume K = 1. The first attempt queries H_26, the second H_29, and the third H_30 (the final call), following Eq. (B10). b) Average number of calls to the quantum annealer required by the binary search assuming Eq. (B10) (left panel) or assuming a uniform distribution of minimum makespans between trivial upper and lower bounds (right panel). Thick and dashed lines correspond to θ = 1 and θ = 0.5, respectively, and the numeric values associated with each color in the figure correspond to different values of K. The operations' execution times are distributed uniformly with P_p = {0, . . . , N/2}.


    4. JSP bounds

The described binary search assumes that a lower bound T_min and an upper bound T_max are readily available. We cover their calculation for the sake of completeness. The simplest lower bounds are the job bound and the machine bound. The job bound is calculated by finding the total execution time of each job and keeping the largest of them, put simply

\max_{n \in \{1, \dots, N\}} \sum_{i = k_{n-1}}^{k_n} p_i,    (B11)

where we use the lexicographic index i for operations and where k_0 = 1. Similarly, we can define the machine bound as

\max_{m \in \{1, \dots, M\}} \sum_{i \in I_m} p_i,    (B12)

where I_m is the set of indices of all operations that need to run on machine m_m. Since these bounds are inexpensive to calculate, we can take the larger of the two. An even better lower bound can be obtained using the iterated Carlier and Pinson (ICP) procedure described in the window-shaving subsection of Section III of the main text. We mentioned that the shaving procedure can show that a timespan does not admit a solution if a window closes completely. Using shaving for different timespans and performing a binary search, we can obtain the ICP lower bound in O(N^2 \log(N) M^2 T_{\max} \log_2(T_{\max} - T_{\min})), where T_min and T_max are some trivial lower and upper bound, respectively, such as the ones described in this section. Given that the search assumes a strict lower bound, we need to decrease whichever bound we chose here by one.

As for the upper bound, an excellent choice is provided by another classical algorithm, developed by Applegate and Cook [42], for some finite computational effort. The straightforward alternative is given by the total work of the problem,

\sum_{i \in \{1, \dots, k_N\}} p_i.    (B13)
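A minimal sketch of these three quantities, assuming an instance given as a list of jobs whose operations are (machine, duration) pairs (a representation chosen for the example, not prescribed by the text):

def jsp_bounds(jobs):
    """Simple makespan bounds for a JSP instance given as a list of jobs,
    each a list of (machine, duration) pairs (Eqs. B11-B13)."""
    job_bound = max(sum(p for _, p in job) for job in jobs)        # Eq. (B11)
    machine_load = {}
    for job in jobs:
        for m, p in job:
            machine_load[m] = machine_load.get(m, 0) + p
    machine_bound = max(machine_load.values())                     # Eq. (B12)
    total_work = sum(sum(p for _, p in job) for job in jobs)       # Eq. (B13)
    lower = max(job_bound, machine_bound)
    return lower, total_work   # (lower bound, trivial upper bound)

# Tiny hypothetical 2-job, 2-machine instance:
jobs = [[(0, 2), (1, 1)], [(1, 2), (0, 2)]]
print(jsp_bounds(jobs))   # (4, 7)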

The solver's limitations can also serve to establish practical bounds for the search. For a given family of problems, if instances of a specific size can only be embedded with some acceptable probability for timespans smaller than T_max^embed, the search can be capped at this limit; failure to find a solution within it leaves the search at T_max^embed, at which point the solver will need to report a failure or switch to another classical approach.

    Appendix C: Classical algorithms

When designing a quantum annealing solver, a survey of classical methods provides much more than a benchmark for comparison and performance. Classical algorithms can sometimes be repurposed as useful pre-processing techniques, as demonstrated with variable pruning. We provide a quick review of the classical methods we use in this work, as well as some details on the classical solvers to which we compare.

    1. Variable pruning

Eliminating superfluous variables can greatly help mitigate the constraints on the number of qubits available. Several elimination rules are available, and we explain below in more detail the procedure we used for our tests.

The first step in reducing the processing windows is to eliminate unneeded variables by considering the precedence constraints between the operations in a job, something we refer to as simple variable pruning. We define r_i as the sum of the execution times of all operations preceding operation O_i. Similarly, we define q_i as the sum of the execution times of all operations following O_i. The numbers r_i and q_i are referred to as the head and tail of operation O_i, respectively. An operation cannot start before its head and must leave enough time after finishing to fit its tail, so the window of possible start times, the processing window, for operation O_i is [r_i, T − p_i − q_i].
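A small sketch of this simple pruning step, computing heads, tails, and the resulting processing windows from the job precedences alone (the list-of-durations input format is an assumption made for illustration):

def processing_windows(jobs, T):
    """Heads r_i, tails q_i, and processing windows [r_i, T - p_i - q_i] from
    job precedence alone (simple variable pruning). Each job is a list of
    durations p; operations are indexed by (job, position)."""
    windows = {}
    for j, durations in enumerate(jobs):
        total = sum(durations)
        head = 0
        for k, p in enumerate(durations):
            tail = total - head - p     # total duration of the operations after O_i
            windows[(j, k)] = (head, T - p - tail)
            head += p
    return windows

# Example: one job with durations (2, 3) and timespan T = 7
print(processing_windows([[2, 3]], T=7))
# {(0, 0): (0, 2), (0, 1): (2, 4)}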

If we consider the one-machine subproblems induced on each machine separately, we can update the heads and tails of each operation and reduce the processing windows further. For example, recalling that I_j is the set of indices of operations that have to run on machine m_j, suppose that a, b ∈ I_j are such that

r_a + p_a + p_b + q_b > T.

Then O_a must run after O_b. This means that we can update r_a with

r_a = \max\{ r_a, r_b + p_b \}.

We can apply similar updates to the tails because of the symmetry between heads and tails. These updates are known in the literature as immediate selections.
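One pass of these immediate selections on a single machine could look like the following sketch, where the dictionary-based operation records (fields r, q, p) are an illustrative choice rather than the authors' data structure:

def immediate_selections(ops, T):
    """One pass of immediate selections on a single machine. `ops` maps an
    operation id to a dict with head r, tail q, and duration p; entries are
    updated in place. Returns True if any head or tail was tightened."""
    changed = False
    ids = list(ops)
    for a in ids:
        for b in ids:
            if a == b:
                continue
            A, B = ops[a], ops[b]
            # O_a cannot precede O_b within timespan T, so O_a must run after O_b.
            if A["r"] + A["p"] + B["p"] + B["q"] > T:
                new_r = max(A["r"], B["r"] + B["p"])
                if new_r > A["r"]:
                    A["r"] = new_r
                    changed = True
                # symmetric update of O_b's tail
                new_q = max(B["q"], A["q"] + A["p"])
                if new_q > B["q"]:
                    B["q"] = new_q
                    changed = True
    return changed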

Better updates can be performed by using ascendant sets, introduced by Carlier and Pinson in [36]. A subset X ⊂ I_j is an ascendant set of c ∈ I_j if c ∉ X and

\min_{a \in X \cup \{c\}} r_a + \sum_{a \in X \cup \{c\}} p_a + \min_{a \in X} q_a > T.

If X is an ascendant set of c, then we can update r_c with

r_c = \max\left\{ r_c,\; \max_{X' \subset X} \left[ \min_{a \in X'} r_a + \sum_{a \in X'} p_a \right] \right\}.

Once again, the symmetry implies that similar updates can be applied to the tails.
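For illustration only, the brute-force sketch below applies the ascendant-set rule literally by enumerating subsets; it is exponential in the number of operations on the machine, whereas Carlier and Pinson's algorithm achieves the same updates far more efficiently. The data layout matches the earlier immediate-selection sketch and is an assumption.

from itertools import combinations

def ascendant_set_update(ops, c, T):
    """Brute-force ascendant-set update of the head of operation c on one
    machine. `ops` maps ids to dicts with head r, tail q, and duration p."""
    others = [a for a in ops if a != c]
    best = ops[c]["r"]
    for size in range(1, len(others) + 1):
        for X in combinations(others, size):
            group = X + (c,)
            if (min(ops[a]["r"] for a in group)
                    + sum(ops[a]["p"] for a in group)
                    + min(ops[a]["q"] for a in X) > T):
                # X is an ascendant set of c: tighten r_c over its subsets X'
                for k in range(1, len(X) + 1):
                    for Xp in combinations(X, k):
                        cand = (min(ops[a]["r"] for a in Xp)
                                + sum(ops[a]["p"] for a in Xp))
                        best = max(best, cand)
    ops[c]["r"] = best
    return best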

Carlier and Pinson provide in [21] an algorithm to perform all of the ascendant-set updates on machine m_j in O(N log(N)), where N = |I_j|. After these updates have been carried out on a per-machine basis, we propagate the new heads and tails using the precedence of the operations by setting

r_{i+1} = \max\{ r_{i+1}, r_i + p_i \},    (C1)
q_i = \max\{ q_i, q_{i+1} + p_{i+1} \},    (C2)

for every pair of operations O_i and O_{i+1} that belong to the same job.

After propagating the updates, we check again whether any ascendant-set updates can be made, and repeat the cycle until no more updates are found. In our tests, we use an implementation similar to the one described in [21] to do the ascendant-set updates.

Algorithm 1 is pseudocode describing the shaving procedure. Here, the procedure UpdateMachine(i) updates the heads and tails for machine i in O(N log(N)), as described by Carlier and Pinson in [21]. It returns True if any updates were made, and False otherwise. PropagateWindows() is a procedure that iterates over the tasks and checks that Eqs. (C1) and (C2) are satisfied, in O(NM).

Algorithm 1 Shaving algorithm
1: procedure icp_shave
2:     updated ← True
3:     while updated do
4:         updated ← False
5:         for i ∈ machines do
6:             updated ← UpdateMachine(i) ∨ updated
7:         if updated then PropagateWindows()

For each repetition of the outermost loop of Algorithm 1, we know that there has been an update to the windows, which means that we have removed at least one variable. Since there are at most NMT variables, the loop will run at most this many times. The internal for loop runs exactly M times and does O(N log(N)) work per machine. Putting all of this together, the final complexity of the shaving procedure is O(N²M²T log(N)).
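A direct Python rendering of Algorithm 1's driver loop might read as follows; update_machine and propagate_windows stand for the per-machine ascendant-set updates and the propagation of Eqs. (C1)–(C2), and are assumed to be provided elsewhere.

def icp_shave(machines, update_machine, propagate_windows):
    """Sketch of Algorithm 1. `update_machine(i)` performs the head/tail
    updates on machine i and returns True if anything changed;
    `propagate_windows()` re-applies Eqs. (C1)-(C2) along each job."""
    updated = True
    while updated:
        updated = False
        for i in machines:
            updated = update_machine(i) or updated
        if updated:
            propagate_windows()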

    2. Classical algorithm implementation

Brucker et al.'s branch-and-bound method [35] remains widely used due to its state-of-the-art status on smaller JSP instances and its competitive performance on larger ones [43]. Furthermore, the original code is freely available through ORSEP [44]. No attempt was made at optimizing this code, and changes were made only to properly interface with our own code and to time the results.

Martin and Shmoys' time-based approach [20] is less clearly defined, in the sense that no publicly available standard code could be found and because a number of variants for both the shaving and the branch-and-bound strategy are described in the paper. As covered in the section on shaving, we have chosen the O(N log(N)) variant of the heads and tails adjustments, the most efficient choice available. On the other hand, we have restricted our branch-and-bound implementation to the simplest strategy proposed, where each node branches between scheduling the next available operation (an operation that has not yet been assigned a starting time) immediately or delaying it. Although technically correct, the same schedule can sometimes appear in both branches because the search is not restricted to active schedules, and unwarranted idle times are sometimes considered. According to Martin and Shmoys, the search strategy can be modified to prevent such occurrences, but these changes are only summarily described and we did not attempt to implement them. Other branching schemes are also proposed which we did not consider for this work. One should be careful when surveying the literature for runtimes of a full-optimization version based on this decision-version solver: what is usually reported assumes the use of a good upper bound, such as the one provided by Applegate and Cook [42], and the runtime required to obtain such bounds must be taken into account as well. It would be interesting to benchmark this decision solver in combination with our proposed optimized search, but this benchmarking we also leave for future work.

Benchmarking of classical methods was performed on an off-the-shelf Intel Core i7-930 processor clocked at 2.8 GHz.

[1] M. W. Johnson, P. Bunyk, F. Maibaum, E. Tolkacheva, A. J. Berkley, E. M. Chapple, R. Harris, J. Johansson, T. Lanting, I. Perminov, et al., Superconductor Science and Technology 23, 065004 (2010), URL http://stacks.iop.org/0953-2048/23/i=6/a=065004.
[2] S. Boixo, T. F. Rønnow, S. V. Isakov, Z. Wang, D. Wecker, D. A. Lidar, J. M. Martinis, and M. Troyer, Nature Physics 10, 218 (2014), ISSN 1745-2473, URL http://dx.doi.org/10.1038/nphys2900.
[3] T. F. Rønnow, Z. Wang, J. Job, S. Boixo, S. V. Isakov, D. Wecker, J. M. Martinis, D. A. Lidar, and M. Troyer, Science 345, 420 (2014), URL http://www.sciencemag.org/content/345/6195/420.abstract.
[4] C. C. McGeoch and C. Wang, in Proceedings of the ACM International Conference on Computing Frontiers (ACM, New York, NY, USA, 2013), CF '13, pp. 23:1–23:11, ISBN 978-1-4503-2053-5, URL http://doi.acm.org/10.1145/2482767.2482797.
[5] V. N. Smelyanskiy, E. G. Rieffel, S. I. Knysh, C. P. Williams, M. W. Johnson, M. C. Thom, W. G. Macready, and K. L. Pudenz, arXiv:1204.2821 [quant-ph] (2012), 1204.2821.

[6] E. G. Rieffel, D. Venturelli, B. O'Gorman, M. B. Do, E. M. Prystay, and V. N. Smelyanskiy, Quantum Information Processing 14, 1 (2015), ISSN 1570-0755, URL http://dx.doi.org/10.1007/s11128-014-0892-x.

[7] I. Hen, J. Job, T. Albash, T. F. Rønnow, M. Troyer, and D. Lidar, arXiv:1502.01663 [quant-ph] (2015), 1502.01663.
[8] H. G. Katzgraber, F. Hamze, Z. Zhu, A. J. Ochoa, and H. Munoz-Bauza, Physical Review X 5, 031026 (2015), URL http://link.aps.org/doi/10.1103/PhysRevX.5.031026.
[9] E. Rieffel, D. Venturelli, M. Do, I. Hen, and J. Frank, Parametrized families of hard planning problems from phase transitions (2014).
[10] B. O'Gorman, E. G. Rieffel, M. Do, and D. Venturelli, in ICAPS 2015, Research Workshop Constraint Satisfaction Techniques for Planning and Scheduling Problems (2015).

[11] D. Venturelli, S. Mandrà, S. Knysh, B. O'Gorman, R. Biswas, and V. Smelyanskiy, Physical Review X 5, 031040 (2015), 1406.7553.
[12] B. O'Gorman, R. Babbush, A. Perdomo-Ortiz, A. Aspuru-Guzik, and V. Smelyanskiy, The European Physical Journal Special Topics 224, 163 (2015), ISSN 1951-6355, URL http://dx.doi.org/10.1140/epjst/e2015-02349-9.
[13] W.-Y. Ku and J. C. Beck (2014).
[14] A. Jain and S. Meeran, European Journal of Operational Research 113, 390 (1999), ISSN 0377-2217, URL http://www.sciencedirect.com/science/article/pii/S0377221798001131.

[15] E. Boros and P. L. Hammer, Discrete Applied Mathematics 123, 155 (2002), ISSN 0166-218X, URL http://dx.doi.org/10.1016/S0166-218X(01)00341-9.
[16] C.-C. Cheng and S. F. Smith, Annals of Operations Research 70, 327 (1997).
[17] A. Garrido, M. A. Salido, F. Barber, and M. López, in Proc. ECAI-2000 Workshop on New Results in Planning, Scheduling and Design (PuK2000) (2000), pp. 44–49.
[18] H. M. Wagner, Naval Research Logistics Quarterly 6, 131 (1959), ISSN 1931-9193, URL http://dx.doi.org/10.1002/nav.3800060205.
[19] A. S. Manne, Operations Research 8, 219 (1960), ISSN 0030364X, URL http://www.jstor.org/stable/167204.
[20] P. Martin and D. B. Shmoys, in Integer Programming and Combinatorial Optimization, edited by W. H. Cunningham, S. McCormick, and M. Queyranne (Springer Berlin Heidelberg, 1996), vol. 1084 of Lecture Notes in Computer Science, pp. 389–403, ISBN 978-3-540-61310-7, URL http://dx.doi.org/10.1007/3-540-61310-2_29.

[21] J. Carlier and E. Pinson, European Journal of Operational Research 78, 146 (1994), ISSN 0377-2217, URL http://www.sciencedirect.com/science/article/pii/0377221794903794.

[22] L. Péridy and D. Rivreau, European Journal of Operational Research 164, 24 (2005), ISSN 0377-2217, URL http://www.sciencedirect.com/science/article/pii/S0377221703009019.
[23] Y. Caseau and F. Laburthe, in Proc. of the 11th International Conference on Logic Programming (1994).
[24] W. M. Kaminsky and S. Lloyd, in Quantum Computing and Quantum Bits in Mesoscopic Systems, edited by A. J. Leggett, B. Ruggiero, and P. Silvestrini (Springer US, 2004), pp. 229–236, ISBN 978-1-4613-4791-0, URL http://dx.doi.org/10.1007/978-1-4419-9092-1_25.
[25] V. Choi, Quantum Information Processing 10, 343 (2011), ISSN 1570-0755, URL http://dx.doi.org/10.1007/s11128-010-0200-3.
[26] I. Adler, F. Dorn, F. V. Fomin, I. Sau, and D. M. Thilikos, in Algorithm Theory - SWAT 2010, edited by H. Kaplan (Springer Berlin Heidelberg, 2010), vol. 6139 of Lecture Notes in Computer Science, pp. 322–333, ISBN 978-3-642-13730-3, URL http://dx.doi.org/10.1007/978-3-642-13731-0_31.
[27] J. Cai, W. G. Macready, and A. Roy, arXiv:1406.2741 [quant-ph] (2014), 1406.2741.
[28] A. Perdomo-Ortiz, J. Fluegemann, R. Biswas, and V. N. Smelyanskiy, arXiv:1503.01083 [quant-ph] (2015), 1503.01083.
[29] S. Boixo, V. N. Smelyanskiy, A. Shabani, S. V. Isakov, M. Dykman, V. S. Denchev, M. Amin, A. Smirnov, M. Mohseni, and H. Neven, arXiv:1411.4036 [quant-ph] (2014), 1411.4036.
[30] V. N. Smelyanskiy, D. Venturelli, A. Perdomo-Ortiz, S. Knysh, and M. I. Dykman, arXiv:1511.02581 [quant-ph] (2015), 1511.02581.
[31] D. Venturelli, D. J. J. Marchand, and G. Rojo, arXiv:1506.08479 [quant-ph] (2015), 1506.08479.
[32] A. Perdomo-Ortiz, B. O'Gorman, J. Fluegemann, R. Biswas, and V. N. Smelyanskiy, arXiv:1503.05679 [quant-ph] (2015), 1503.05679.
[33] A. D. King and C. C. McGeoch, arXiv:1410.2628 [cs.DS] (2014), 1410.2628.
[34] K. L. Pudenz, T. Albash, and D. A. Lidar, Nature Communications 5 (2014), URL http://dx.doi.org/10.1038/ncomms4243.
[35] P. Brucker, B. Jurisch, and B. Sievers, Discrete Applied Mathematics 49, 107 (1994), ISSN 0166-218X, URL http://www.sciencedirect.com/science/article/pii/0166218X94902046.
[36] J. Carlier and E. Pinson, Annals of Operations Research 26, 269 (1991), ISSN 0254-5330, URL http://dl.acm.org/citation.cfm?id=115185.115198.
[37] J. C. Beck, T. K. Feng, and J.-P. Watson, INFORMS Journal on Computing 23, 1 (2011).
[38] M. J. Streeter and S. F. Smith, Journal of Artificial Intelligence Research 26, 247 (2006).
[39] W. Vinci, T. Albash, G. Paz-Silva, I. Hen, and D. A. Lidar, Physical Review A 92, 042310 (2015).
[40] A. D. King, T. Lanting, and R. Harris, arXiv:1502.02098 [quant-ph] (2015), 1502.02098.
[41] V. Choi, Quantum Information Processing 7, 193 (2008), ISSN 1570-0755, URL http://dx.doi.org/10.1007/s11128-008-0082-9.

[42] D. Applegate and W. Cook, ORSA J. Comput. 3, 149 (1991).
[43] W. Brinkkötter and P. Brucker, Journal of Scheduling 4, 53 (2001), ISSN 1099-1425, URL http://dx.doi.org/10.1002/1099-1425(200101/02)4:13.0.CO;2-Y.
[44] P. Brucker, B. Jurisch, and B. Sievers, European Journal of Operational Research 57, 132 (1992).

