
A Rank-1 Semidefinite Programming Formulation for Finding D-optimal Designs

Dursun A. Bulutoglu
August 2, 2019

The views expressed in this presentation are those of the author and do not reflect the official policy or position of the Air Force, Department of Defense or the U.S. Government.


Outline

• D-optimal factorial designs

• Hadamard’s determinant problem

• Legendre Pairs and two circulant core Hadamard matrices

• A rank-1 constrained semidefinite programming formulation

• The augmented Lagrangian method


Hadamard Determinant Problem

• For $N$ experimental runs, $k$ factors, and 2 levels per factor, the most efficient design for estimating the model with all the main effects and the intercept is an $N \times (k + 1)$ matrix $X$ of ±1s such that $\det(X^T X)$ is maximized.

• The problem of finding such a matrix is known as the Hadamard determinant problem.


Hadamard Determinant Problem (cont.)

• By Hadamard's inequality,

$$\det(X^T X) \le N^{k+1}. \tag{1}$$

• When $k = N - 1$ ($k \le N - 1$), equality in inequality (1) is achievable by a (partial) Hadamard matrix.

• An $N \times N$ ($N \times m$) matrix $H$ of ±1s is called a (partial) Hadamard matrix if

$$H^T H = N I_N \quad (\text{respectively } H^T H = N I_m),$$

where $m \le N$ and $I_n$ is the $n \times n$ identity matrix.
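As a small illustration of equality in (1), the following sketch builds a Sylvester-type Hadamard matrix of order 4 and checks the determinant exactly. The Sylvester doubling construction and the helper names `sylvester`, `gram`, and `det_int` are assumptions introduced here for illustration; they do not come from the slides.

```python
def sylvester(doublings):
    """Return the 2^k x 2^k Sylvester Hadamard matrix as a list of rows."""
    h = [[1]]
    for _ in range(doublings):
        # Doubling step: H -> [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def gram(x):
    """Return X^T X for a matrix X given as a list of rows."""
    cols = list(zip(*x))
    return [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]


def det_int(m):
    """Exact determinant of an integer matrix (fraction-free Bareiss elimination)."""
    a = [row[:] for row in m]
    n, sign, prev = len(a), 1, 1
    for k in range(n - 1):
        if a[k][k] == 0:
            # Swap in a row with a nonzero pivot, or conclude det = 0.
            for i in range(k + 1, n):
                if a[i][k] != 0:
                    a[k], a[i] = a[i], a[k]
                    sign = -sign
                    break
            else:
                return 0
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) // prev
        prev = a[k][k]
    return sign * a[n - 1][n - 1]


H = sylvester(2)                 # N = 4, k = N - 1
N = len(H)
X = [row[:3] for row in H]       # partial Hadamard matrix with k + 1 = 3 columns
print(det_int(gram(H)) == N ** N)    # equality in (1) for k + 1 = N
print(det_int(gram(X)) == N ** 3)    # equality in (1) for k + 1 = 3
```

Integer arithmetic is used throughout so that the comparison with $N^{k+1}$ is exact rather than subject to floating-point error.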


Hadamard Determinant Problem (cont.)

• For an $N \times N$ matrix $X$ of ±1s,

$$\det(X^T X) \le \begin{cases} N^N, & \text{if } N \equiv 0 \pmod 4,\\ (2N - 1)(N - 1)^{N-1}, & \text{if } N \equiv 1 \pmod 4,\\ 4(N - 1)^2 (N - 2)^{N-2}, & \text{if } N \equiv 2 \pmod 4. \end{cases}$$

1. The $N \equiv 0 \pmod 4$ case is called the Hadamard bound.

2. The $N \equiv 1 \pmod 4$ case is called the Barba bound.

3. The $N \equiv 2 \pmod 4$ case is called the Ehlich-Wojtas bound.

4. For $N \equiv 3 \pmod 4$ there is a more complicated bound known as the Ehlich bound.
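The three closed-form bounds can be evaluated directly. The sketch below is only a transcription of the case formula into code; the function name `det_upper_bound` is a hypothetical helper, and the Ehlich bound for $N \equiv 3 \pmod 4$ is deliberately left out because the slides do not state it.

```python
def det_upper_bound(n):
    """Upper bound on det(X^T X) for an n x n matrix X of +-1s, n != 3 (mod 4)."""
    r = n % 4
    if r == 0:
        return n ** n                                   # Hadamard bound
    if r == 1:
        return (2 * n - 1) * (n - 1) ** (n - 1)         # Barba bound
    if r == 2:
        return 4 * (n - 1) ** 2 * (n - 2) ** (n - 2)    # Ehlich-Wojtas bound
    raise ValueError("n = 3 (mod 4): the Ehlich bound is not implemented here")


for n in (4, 5, 6, 8):
    print(n, det_upper_bound(n))
```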


Hadamard Determinant Problem (cont.)

The above bounds are not always achievable for $N \equiv 1, 2, 3 \pmod 4$.

1. Since $\det(X^T X) = \det(X)^2$, each of the above bounds is achievable only if it is a perfect square. Hence, for $N \equiv 1 \pmod 4$, the Barba bound is achievable only if $2N - 1$ is a perfect square. An infinite family achieving the Barba bound is known to exist.

2. For $N \equiv 2 \pmod 4$, the Ehlich-Wojtas bound is achievable only if $N - 1$ is the sum of two perfect squares (Ehlich, 1964). Two infinite families achieving this bound are known to exist.
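The two necessary conditions above are easy to test for a given $N$. The following is a minimal sketch; the name `bound_may_be_attained` is hypothetical, zero is allowed as one of the two squares (an assumption about how the condition is read), and a `True` result only means the necessary condition holds, not that a design attaining the bound exists.

```python
from math import isqrt


def is_square(m):
    """True if the integer m is a perfect square."""
    return m >= 0 and isqrt(m) ** 2 == m


def is_sum_of_two_squares(m):
    """True if m = a^2 + b^2 for some integers a, b >= 0."""
    return any(is_square(m - a * a) for a in range(isqrt(m) + 1))


def bound_may_be_attained(n):
    """Necessary condition for the Barba (n = 1 mod 4) or Ehlich-Wojtas (n = 2 mod 4) bound."""
    r = n % 4
    if r == 1:
        return is_square(2 * n - 1)
    if r == 2:
        return is_sum_of_two_squares(n - 1)
    raise ValueError("only n = 1 or 2 (mod 4) is covered by these conditions")


for n in (5, 9, 13, 6, 22):
    print(n, bound_may_be_attained(n))
```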

Improving the determinant upper bound is accomplished on a case-by-case basis and requires enumeration. For all the related references, see

www.indiana.edu/~maxdet.


Circulant Matrices

• An $n \times n$ matrix is called circulant if its rows are obtained by cyclically shifting its first row to the right.

• An example:

$$A = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \\ 3 & 4 & 1 & 2 \\ 2 & 3 & 4 & 1 \end{pmatrix}.$$

• We use the notation $A = \mathrm{circ}((1, 2, 3, 4))$ for $A$.
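A minimal sketch of the circ notation in code, assuming 0-based indexing so that entry $(i, j)$ equals the first-row entry at position $(j - i) \bmod n$. The function name `circ` mirrors the slide's notation but is otherwise an illustrative choice.

```python
def circ(first_row):
    """Circulant matrix whose i-th row is the first row shifted i places to the right."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


for row in circ([1, 2, 3, 4]):
    print(row)
```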


Circulant Matrices (cont.)

• If $A$ is circulant, then $A^T A$ is also circulant.

• For circulant $A$ we have

$$A A^T = A^T A.$$

• If $A = \mathrm{circ}(x^T)$, with $x \in \mathbb{R}^t$, then $A^T A$ is calculated by using the periodic autocorrelation function $P_x(s)$,

$$P_x(s) = \sum_{i=1}^{t} x_i \, x_{i+s \pmod t} \quad \text{for } s = 0, 1, \ldots, t - 1.$$
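The relation between $A^T A$ and the periodic autocorrelation can be checked numerically. In the sketch below (0-based indices, helper names `paf`, `transpose`, and `matmul` are illustrative assumptions), the check is that $A^T A = \mathrm{circ}((P_x(0), \ldots, P_x(t-1)))$ and that $A A^T = A^T A$.

```python
def circ(first_row):
    """Circulant matrix whose i-th row is the first row shifted i places to the right."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


def paf(x, s):
    """Periodic autocorrelation P_x(s) = sum_i x_i * x_{(i+s) mod t}."""
    t = len(x)
    return sum(x[i] * x[(i + s) % t] for i in range(t))


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    qt = transpose(q)
    return [[sum(u * v for u, v in zip(row, col)) for col in qt] for row in p]


x = [1, 2, 3, 4]
A = circ(x)
AtA = matmul(transpose(A), A)
print(AtA == circ([paf(x, s) for s in range(len(x))]))   # A^T A is circulant
print(AtA == matmul(A, transpose(A)))                     # A A^T = A^T A
```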


Legendre Pairs

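Let $t$ be odd and let $a, b \in \{\pm 1\}^t$ be such that

$$\sum_{i=1}^{t} a_i = \sum_{i=1}^{t} b_i = \pm 1,$$

and

$$P_a(s) + P_b(s) = -2 \quad \text{for } s = 1, 2, \ldots, t - 1.$$

Then $(a, b)$ is called a Legendre pair of length $t$, and

$$\mathrm{circ}(a^T)\,\mathrm{circ}(a^T)^T + \mathrm{circ}(b^T)\,\mathrm{circ}(b^T)^T = (2t + 2) I_t - 2 J_t,$$

where $J_t$ is the $t \times t$ all 1s matrix.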
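The definition can be checked mechanically. The sketch below tests the two defining conditions and the matrix identity for a length-5 pair; the example vectors were chosen by hand for this illustration and do not come from the slides, and the helper names are hypothetical.

```python
def circ(first_row):
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


def paf(x, s):
    """Periodic autocorrelation P_x(s)."""
    t = len(x)
    return sum(x[i] * x[(i + s) % t] for i in range(t))


def is_legendre_pair(a, b):
    """Check the definition: odd length, +-1 entries, equal sums +-1, PAF sums equal to -2."""
    t = len(a)
    if t % 2 == 0 or len(b) != t:
        return False
    if any(v not in (1, -1) for v in list(a) + list(b)):
        return False
    if sum(a) != sum(b) or abs(sum(a)) != 1:
        return False
    return all(paf(a, s) + paf(b, s) == -2 for s in range(1, t))


def gram_identity_holds(a, b):
    """Check circ(a)circ(a)^T + circ(b)circ(b)^T = (2t+2) I - 2 J entry by entry."""
    t = len(a)
    A, B = circ(a), circ(b)
    for i in range(t):
        for k in range(t):
            lhs = sum(A[i][j] * A[k][j] + B[i][j] * B[k][j] for j in range(t))
            rhs = (2 * t + 2 if i == k else 0) - 2
            if lhs != rhs:
                return False
    return True


a = [1, 1, 1, -1, -1]   # hand-picked example, t = 5
b = [1, -1, 1, -1, 1]
print(is_legendre_pair(a, b), gram_identity_holds(a, b))
```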


Legendre Pairs (cont.)

• A Legendre pair has been constructed when

1. $t$ is a prime, see Fletcher et al. (2001);

2. $t = p_1 p_2$ with $p_2 = p_1 + 2$, where $p_1, p_2$ are odd primes, see Cathain and Stafford (2010);

3. $2t + 1$ is a prime power (by Szekeres), see Wallis et al. (1972);

4. $t = 2^m - 1$ for some positive integer $m \ge 2$, see Schroeder (1984).

• Each of these constructions uses finite fields or Gauss sums from number theory.


Legendre Pairs (cont.)

• Fletcher et al. (2001) observed that the number of Legendre pairs of length $t$ grows exponentially with $t$.

• Hence, they conjectured that a Legendre pair exists for all odd $t$.

• Proving this conjecture proves the Hadamard conjecture via the following two circulant core Hadamard matrix construction.


Two Circulant Core Hadamard Matrices

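Let $(a, b)$ be a Legendre pair of length $t$ whose common sum is $+1$ (a pair with common sum $-1$ can be replaced by $(-a, -b)$, which leaves the autocorrelations unchanged). Then

$$H = \begin{pmatrix}
-1 & -1 & 1 \ \cdots \ 1 & 1 \ \cdots \ 1 \\
-1 & 1 & 1 \ \cdots \ 1 & -1 \ \cdots \ -1 \\
1 & 1 & & \\
\vdots & \vdots & \mathrm{circ}(a^T) & \mathrm{circ}(b^T) \\
1 & 1 & & \\
1 & -1 & & \\
\vdots & \vdots & \mathrm{circ}(b^T)^T & -\mathrm{circ}(a^T)^T \\
1 & -1 & &
\end{pmatrix}$$

is a $(2t + 2) \times (2t + 2)$ Hadamard matrix, and $H$ is called a two circulant core Hadamard matrix.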


A Rank-1 Constrained Semidefinite Programming Formulation for Legendre Pairs

Let $(x, y)$ be a sought-after Legendre pair of length $t$. Let $z = (x^T, y^T)^T$ and $Y = z z^T$. Then the following constraints characterize $z$:

$$\sum_{\substack{i - j \equiv r \pmod t \\ 1 \le i, j \le t}} Y_{ij} + \sum_{\substack{i - j \equiv r \pmod t \\ t + 1 \le i, j \le 2t}} Y_{ij} = -2, \quad \text{for } r = 1, \ldots, t - 1,$$

$$Y_{jj} = 1, \quad \text{for } j = 1, \ldots, 2t,$$

and we get the following rank-1 constrained semidefinite programming formulation for Legendre pairs.


A Rank-1 Constrained Semidefinite Programming Formulation for Legendre Pairs (cont.)

$$\sum_{\substack{i - j \equiv r \pmod t \\ 1 \le i, j \le t}} Y_{ij} + \sum_{\substack{i - j \equiv r \pmod t \\ t + 1 \le i, j \le 2t}} Y_{ij} = -2, \quad \text{for } r = 1, \ldots, t - 1, \tag{2}$$

$$Y_{jj} = 1, \quad \text{for } j = 1, \ldots, 2t,$$

$$Y \succeq 0,$$

$$\mathrm{rank}(Y) = 1.$$

If the right-hand side of the constraints (2) is replaced by 1 (respectively 2), then the resulting set of constraints characterizes a D-optimal design achieving the Barba (respectively Ehlich-Wojtas) bound.
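To make the constraints concrete, the sketch below forms $Y = z z^T$ from a known Legendre pair and checks the diagonal constraints, the constraints (2) for $r = 1, \ldots, t - 1$, and the rank-1 condition via vanishing $2 \times 2$ minors. Positive semidefiniteness is automatic for a matrix of the form $z z^T$. Indices are 0-based, and the helper names and the example pair are assumptions made for this illustration.

```python
def outer(z):
    """Return the matrix z z^T."""
    return [[zi * zj for zj in z] for zi in z]


def constraint_lhs(Y, t, r):
    """Left-hand side of (2): sum of Y_ij with i - j = r (mod t) over both diagonal blocks."""
    total = 0
    for offset in (0, t):
        total += sum(Y[offset + i][offset + j]
                     for i in range(t) for j in range(t) if (i - j) % t == r)
    return total


def satisfies_constraints(Y, t, rhs=-2):
    """Check Y_jj = 1 for all j and constraints (2) with the given right-hand side."""
    diag_ok = all(Y[j][j] == 1 for j in range(2 * t))
    corr_ok = all(constraint_lhs(Y, t, r) == rhs for r in range(1, t))
    return diag_ok and corr_ok


def has_rank_one(Y):
    """A nonzero matrix has rank 1 exactly when every 2 x 2 minor vanishes."""
    n = len(Y)
    nonzero = any(v != 0 for row in Y for v in row)
    minors_vanish = all(Y[i][j] * Y[k][l] == Y[i][l] * Y[k][j]
                        for i in range(n) for k in range(i + 1, n)
                        for j in range(n) for l in range(j + 1, n))
    return nonzero and minors_vanish


x = [1, 1, 1, -1, -1]   # hand-picked Legendre pair of length 5
y = [1, -1, 1, -1, 1]
Y = outer(x + y)
print(satisfies_constraints(Y, 5), has_rank_one(Y))
```

Passing `rhs=1` or `rhs=2` to `satisfies_constraints` gives the modified right-hand sides mentioned for the Barba and Ehlich-Wojtas cases.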


A Rank-1 Constrained Semidefinite Programming Formulation for Legendre Pairs (cont.)

• The constraints

$$\mathrm{rank}(Y) = 1 \quad \text{and} \quad Y_{ii} = 1 \ \text{for } i = 1, \ldots, 2t$$

imply that all rows of $Y$ coincide up to multiplication by $-1$.

• The most difficult constraint is $\mathrm{rank}(Y) = 1$; it is what makes the problem non-convex.

• Yet for any $z = (x^T, y^T)^T \in \{-1, 1\}^{2t}$, $Y = z z^T$ is a solution to the non-convex relaxation of this problem obtained by removing constraints (2).


A Rank-1 Constrained Semidefinite Programming Formulation for Legendre Pairs (cont.)

• As in the MAX-CUT problem, the feasible set of the convex relaxation obtained by removing the $\mathrm{rank}(Y) = 1$ constraint is compact.

• This is true because every feasible matrix $Y$ satisfies $\mathrm{Tr}(Y) = 2t$.

• A canonical solution $Y$ with rank $2t$ (full rank) to the convex relaxation is given by

$$Y_{ij} = \begin{cases} 1 & \text{if } i = j,\\ -\frac{1}{t} & \text{if } i \ne j \text{ and } i, j \in \{1, \ldots, t\},\\ -\frac{1}{t} & \text{if } i \ne j \text{ and } i, j \in \{t + 1, \ldots, 2t\},\\ 0 & \text{otherwise.} \end{cases}$$
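The canonical solution can be verified exactly with rational arithmetic. The sketch below checks the diagonal constraints, the constraints (2) for $r = 1, \ldots, t - 1$, and full rank with positive definiteness via the pivots of Gaussian elimination without row exchanges (for a symmetric matrix, all pivots positive is equivalent to positive definiteness). Function names are illustrative assumptions.

```python
from fractions import Fraction


def canonical_solution(t):
    """Block-diagonal Y with 1 on the diagonal and -1/t elsewhere inside each t x t block."""
    n = 2 * t
    Y = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                Y[i][j] = Fraction(1)
            elif (i < t) == (j < t):
                Y[i][j] = Fraction(-1, t)
    return Y


def constraint_lhs(Y, t, r):
    """Left-hand side of (2): sum of Y_ij with i - j = r (mod t) over both diagonal blocks."""
    return sum(Y[o + i][o + j] for o in (0, t)
               for i in range(t) for j in range(t) if (i - j) % t == r)


def pivots(M):
    """Pivots of Gaussian elimination without row exchanges (stops at a zero pivot)."""
    a = [row[:] for row in M]
    n, out = len(a), []
    for k in range(n):
        p = a[k][k]
        out.append(p)
        if p == 0:
            break
        for i in range(k + 1, n):
            f = a[i][k] / p
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
    return out


t = 5
Y = canonical_solution(t)
print(all(constraint_lhs(Y, t, r) == -2 for r in range(1, t)))
print(len(pivots(Y)) == 2 * t and all(p > 0 for p in pivots(Y)))
```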


A Rank-1 Constrained Semidefinite Programming Formulation for Legendre Pairs (cont.)

• By the results of Barvinok (1995) and Pataki (1998), the canonical solution $Y$ and the compactness of the feasible set of the convex relaxation imply that a solution with rank at most $\lfloor -1/2 + \sqrt{1 + 32t}/2 \rfloor$ exists.
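The rank bound can be evaluated with integer arithmetic only. The sketch below uses `math.isqrt` to avoid floating-point rounding; the function name `rank_bound` is hypothetical. As a consistency check, the expression equals the largest $r$ with $r(r + 1)/2 \le 4t$; the count $4t$ is read off from the formula itself and is not stated on the slide.

```python
from math import isqrt


def rank_bound(t):
    """Evaluate floor(-1/2 + sqrt(1 + 32 t) / 2) exactly."""
    return (isqrt(1 + 32 * t) - 1) // 2


def largest_r(m):
    """Largest r with r (r + 1) / 2 <= m, found by direct search."""
    r = 0
    while (r + 1) * (r + 2) // 2 <= m:
        r += 1
    return r


for t in (3, 5, 11, 101):
    print(t, rank_bound(t), largest_r(4 * t))
```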


A Rank-1 Constrained Semidefinite Programming Formulation for Legendre Pairs (cont.)

• By using the principal character from algebra, it can be shown that the constraint

$$\sum_{j=1}^{t} Y_{1j} = \sum_{j=t+1}^{2t} Y_{(t+1)j} = 1 \ \text{or} \ -1$$

is valid for all feasible solutions to the original problem.

• So, the solutions to the problem can be divided into two types:

1. those for which $\sum_{j=1}^{t} Y_{1j} = \sum_{j=t+1}^{2t} Y_{(t+1)j} = 1$;

2. those for which $\sum_{j=1}^{t} Y_{1j} = \sum_{j=t+1}^{2t} Y_{(t+1)j} = -1$.


A Rank-1 Constrained Semidefinite Programming Formulation for Legendre Pairs (cont.)

• The number of solutions is the same for both types.

• Fletcher et al. (2001) observed that the number of rank-1 solutions grows exponentially with $t$.

• As in the MAX-CUT problem, these formulations may lead to new insights for the existence problem for Legendre pairs, Hadamard matrices, and D-optimal designs achieving the Barba bound as well as the Ehlich-Wojtas bound.


The Penalty Method

An optimization problem

$$\min f(x) \quad \text{subject to } c_i(x) = 0 \ \ \forall i \in I$$

can be solved as a series of unconstrained minimization problems:

$$\min \Phi_k(x) = f(x) + \mu_k \sum_{i \in I} c_i(x)^2.$$


The Penalty Method (cont.)

• The penalty method solves this problem for $k = 1, 2, 3, \ldots$, where at the $(k+1)$th iteration it solves the problem using $\mu_{k+1}$, where $\mu_{k+1} > \mu_k > 0$.

• The solution to the non-linear unconstrained optimization problem at the $k$th step is a local minimum.


The Penalty Method (cont.)

• The penalty method uses the solution from the $k$th iteration as an initial guess or “warm-start” for the $(k+1)$th iteration.

• As $\mu_k$ is increased, the local optima of the unconstrained problem approach feasible points of the original problem.
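The penalty iteration can be sketched on a toy problem. The example below is an assumption chosen for illustration, not the Legendre pair problem: minimize $f(x) = x_1^2 + 2x_2^2$ subject to $c(x) = x_1 + x_2 - 1 = 0$, whose solution is $x^* = (2/3, 1/3)$. The inner solver is a plain Newton iteration with a hand-coded gradient and Hessian, and each outer step is warm-started from the previous iterate.

```python
def c(x):
    """Single equality constraint c(x) = x1 + x2 - 1."""
    return x[0] + x[1] - 1.0


def newton_2d(grad, hess, x, tol=1e-14, max_iter=20):
    """Newton's method in two variables; the 2 x 2 system is solved by Cramer's rule."""
    for _ in range(max_iter):
        g1, g2 = grad(x)
        (h11, h12), (h21, h22) = hess(x)
        det = h11 * h22 - h12 * h21
        d1 = (-g1 * h22 + g2 * h12) / det
        d2 = (-g2 * h11 + g1 * h21) / det
        x = (x[0] + d1, x[1] + d2)
        if abs(d1) + abs(d2) < tol:
            break
    return x


def penalty_method(mus, x0=(0.0, 0.0)):
    """Minimize Phi_k(x) = f(x) + mu_k c(x)^2 for an increasing sequence mu_k."""
    x, history = x0, []
    for mu in mus:
        grad = lambda p, mu=mu: (2 * p[0] + 2 * mu * c(p), 4 * p[1] + 2 * mu * c(p))
        hess = lambda p, mu=mu: ((2 + 2 * mu, 2 * mu), (2 * mu, 4 + 2 * mu))
        x = newton_2d(grad, hess, x)      # warm start from the previous solution
        history.append((mu, x, c(x)))
    return history


for mu, x, violation in penalty_method([1.0, 10.0, 100.0, 1000.0]):
    print(mu, x, violation)
```

For this toy problem the constraint violation at the minimizer of $\Phi_k$ is $-2/(2 + 3\mu_k)$, so feasibility is reached only in the limit of large $\mu_k$.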


The Augmented Lagrangian Method

• If $x^*$, $\lambda^*$ satisfy the second-order sufficient conditions for the original problem and $\mu_k$ is larger than a threshold, then $x^*$ is a strict local minimum of $\Phi_k(x, \mu_k, \lambda^*)$.

• This suggests that if we set $\lambda_k$ close to $\lambda^*$ and do unconstrained minimization of $\Phi_k(x, \mu_k, \lambda_k)$, then we can find $x$ close to $x^*$.


The Augmented Lagrangian Method (cont.)

The augmented Lagrangian method uses the following unconstrained objective:

$$\min \Phi_k(x, \mu_k, \lambda_k) = f(x) + \frac{\mu_k}{2} \sum_{i \in I} c_i(x)^2 + \sum_{i \in I} \lambda_{i,k} c_i(x).$$

After each iteration, in addition to updating $\mu_k$, the variables $\lambda_{i,k}$ are also updated according to the rule

$$\lambda_{i,k+1} := \lambda_{i,k} + \mu_k c_i(x_k),$$

where $x_k$ is the solution to the unconstrained problem at the $k$th step.
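A minimal sketch of this iteration, on a toy problem assumed for illustration only: minimize $f(x) = x_1^2 + 2x_2^2$ subject to $c(x) = x_1 + x_2 - 1 = 0$, with solution $x^* = (2/3, 1/3)$ and multiplier $\lambda^* = -4/3$. To isolate the effect of the multiplier update, $\mu_k$ is held fixed at 10 here, which departs from the slide's scheme of also updating $\mu_k$. The Newton inner solver and all names are illustrative.

```python
def c(x):
    """Single equality constraint c(x) = x1 + x2 - 1."""
    return x[0] + x[1] - 1.0


def newton_2d(grad, hess, x, tol=1e-14, max_iter=20):
    """Newton's method in two variables; the 2 x 2 system is solved by Cramer's rule."""
    for _ in range(max_iter):
        g1, g2 = grad(x)
        (h11, h12), (h21, h22) = hess(x)
        det = h11 * h22 - h12 * h21
        d1 = (-g1 * h22 + g2 * h12) / det
        d2 = (-g2 * h11 + g1 * h21) / det
        x = (x[0] + d1, x[1] + d2)
        if abs(d1) + abs(d2) < tol:
            break
    return x


def augmented_lagrangian(mus, lam0=0.0, x0=(0.0, 0.0)):
    """Minimize f + (mu/2) c^2 + lam c, then update lam := lam + mu c(x_k)."""
    x, lam, history = x0, lam0, []
    for mu in mus:
        grad = lambda p, mu=mu, lam=lam: (2 * p[0] + mu * c(p) + lam,
                                          4 * p[1] + mu * c(p) + lam)
        hess = lambda p, mu=mu: ((2 + mu, mu), (mu, 4 + mu))
        x = newton_2d(grad, hess, x)      # warm start from the previous solution
        lam = lam + mu * c(x)             # multiplier update
        history.append((mu, x, lam, c(x)))
    return history


for mu, x, lam, violation in augmented_lagrangian([10.0] * 8):
    print(mu, x, lam, violation)
```

For this toy problem the multiplier error shrinks by the factor $1/(1 + 3\mu/4)$ per outer iteration, so the iterates converge with $\mu$ fixed.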


The Augmented Lagrangian Method (cont.)

• The variables $\lambda_{i,k}$ are estimates of the Lagrange multipliers, and the accuracy of these estimates improves at every step.

• The major advantage of the method is that, unlike the penalty method, it is not necessary to take $\mu_k \to \infty$ in order to solve the original constrained problem.

• Because of the presence of the Lagrange multiplier term $\lambda_k$, $\mu_k$ can stay much smaller.

• This avoids the ill-conditioning of $\nabla^2_{xx} \Phi_k(x, \mu_k, \lambda_k)$.


The Augmented Lagrangian Method (cont.)

• To overcome ill-conditioning:

1. Use a Newton-like method (with double precision);

2. Use good starting points;

3. For $\lambda_0$, if a good guess for a compact set where the optimal dual solution $\lambda^*$ must reside is known, then choose some value there. Otherwise, set $\lambda_0$ to 0;

4. Increase $\mu_k$ at a moderate rate. (A good practical scheme is to use $\mu_k = \beta^k$ for $\beta \in [5, 10]$.)


The Augmented Lagrangian Method (cont.)

• For details, see Bertsekas (1996), Chapter 2, and Nocedal and Wright (1999), Section 17.4.

• An implementation of the augmented Lagrangian method is available in NLopt:

https://nlopt.readthedocs.io/en/latest/NLopt_Algorithms/


References

• Bertsekas, D. (1996). Constrained Optimization and Lagrange Multiplier Methods. Athena Scientific, Belmont, Massachusetts.

• Barvinok, A. (1995). Problems of distance geometry and convex properties of quadratic maps. Discrete and Computational Geometry, 13, 189-202.

• Cathain, P. O. and Stafford, R. M. (2010). On twin prime power Hadamard matrices. Cryptogr. Commun., 2, 261-269.

• Ehlich, H. (1964). Determinantenabschätzungen für binäre Matrizen. Math. Z., 83, 123-132.

• Fletcher, R. J., Gysin, M. and Seberry, J. (2001). Application of the discrete Fourier transform to the search for generalised Legendre pairs and Hadamard matrices. Australas. J. Combin., 23, 75-86.

• Pataki, G. (1998). On the rank of extreme matrices in semidefinite programs and the multiplicity of optimal eigenvalues. Mathematics of Operations Research, 23, 339-358.

• Nocedal, J. and Wright, S. (1999). Numerical Optimization. Springer-Verlag, New York.


References (cont.)

• Schroeder, W. D. (1984). Number Theory in Science and Communication. Springer-Verlag, New York.

• Wallis, W. D., Street, A. P. and Seberry, J. (1972). Combinatorics: Room Squares, Sum-free Sets, Hadamard Matrices. Lecture Notes in Math., 292, Springer-Verlag, New York.
