


Designing Structured Tight Frames via an Alternating Projection Method

Joel A. Tropp, Student Member, IEEE, Inderjit S. Dhillon, Member, IEEE, Robert W. Heath Jr., Member, IEEE, and Thomas Strohmer

Abstract— Tight frames, also known as general Welch-Bound-Equality sequences, generalize orthonormal systems. Numerous applications—including communications, coding and sparse approximation—require finite-dimensional tight frames that possess additional structural properties. This paper proposes an alternating projection method that is versatile enough to solve a huge class of inverse eigenvalue problems, which includes the frame design problem. To apply this method, one only needs to solve a matrix nearness problem that arises naturally from the design specifications. Therefore, it is fast and easy to develop versions of the algorithm that target new design problems. Alternating projection will often succeed even if algebraic constructions are unavailable.

To demonstrate that alternating projection is an effective tool for frame design, the article studies some important structural properties in detail. First, it addresses the most basic design problem—constructing tight frames with prescribed vector norms. Then, it discusses equiangular tight frames, which are natural dictionaries for sparse approximation. Last, it examines tight frames whose individual vectors have low peak-to-average-power ratio (PAR), which is a valuable property for CDMA applications. Numerical experiments show that the proposed algorithm succeeds in each of these three cases. The appendices investigate the convergence properties of the algorithm.

Index Terms— Frames, sequences, algorithms, eigenvalues and eigenfunctions, inverse problems, geometry, extremal problems, code division multiaccess

I. INTRODUCTION

TIGHT FRAMES provide a natural generalization of orthonormal systems, and they arise in numerous practical and theoretical contexts [1]. There is no shortage of tight frames, and applications will generally require that the vectors comprising the frame have some kind of additional structure. For example, it might be necessary for the vectors to have specific Euclidean norms, or perhaps they should have small mutual inner products. Thus arises a design problem: How do you build a structured tight frame?

A. Contributions

To address the design question, this article proposes a numerical method based on alternating projection that builds on our work in [2], [3]. The algorithm alternately finds the nearest tight frame to a given ensemble of structured vectors; then it finds the ensemble of structured vectors nearest to the tight frame; and it repeats the process ad infinitum. This technique is analogous to the method of projection onto convex sets (POCS) [4], [5], except that the class of tight frames is non-convex, which complicates the analysis significantly. Nevertheless, our alternating projection algorithm affords simple implementations, and it provides a quick route to solve difficult frame design problems. We argue that similar techniques apply to a huge class of inverse eigenvalue problems.

J. A. Tropp is with the Inst. for Computational Engineering and Sciences (ICES), The University of Texas at Austin, Austin, TX 78712 USA, <[email protected]>.
I. S. Dhillon is with the Dept. of Computer Sciences, The University of Texas at Austin, Austin, TX 78712 USA, <[email protected]>.
R. W. Heath is with the Dept. of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712 USA, <[email protected]>.
T. Strohmer is with the Dept. of Mathematics, The University of California at Davis, Davis, CA 95616 USA, <[email protected]>.

This article demonstrates the elegance and effectiveness of the alternating projection approach with several examples that are motivated by applications. First, we address the most basic frame design problem—building tight frames with prescribed vector norms. This problem arises when constructing signature sequences for direct-spread, synchronous code division multiaccess (DS-CDMA) systems [6]–[8]. Second, we discuss equiangular tight frames, which have the property that each pair of distinct vectors meets at the same (acute) angle. These frames have many applications in coding and communications [9]–[12], and they also form natural dictionaries for sparse approximation [13]–[15]. Third, we examine tight frames whose individual vectors have low peak-to-average-power ratio (PAR), which is another valuable property for DS-CDMA signatures [3]. Our experiments show that alternating projection outperforms some algorithms that were specifically designed to solve these problems.

The appendices investigate the convergence properties of the algorithm. Although alternating projection between subspaces and convex sets has been studied in detail, very few results are available for an alternating projection between two non-convex sets. This paper provides a rigorous treatment of the algorithm's behavior in this general setting by means of the theory of point-to-set maps. In particular, we establish a weak global convergence result, and we show that, under additional hypotheses, the algorithm exhibits stronger local convergence properties.

Note that there is a major conceptual difference between the use of finite models in the numerical calculation of infinite-dimensional frames and the design of finite-dimensional frames. In the former case, the finite model plays only an auxiliary role in the approximate computation of an infinite-dimensional tight frame [1]. In the latter case, the problem under consideration is already finite-dimensional, and thus it does not involve discretization issues. In this paper, we consider only finite-dimensional tight frames.

Citation: ICES Report 03-50, The University of Texas at Austin, December 2003. Submitted for publication, December 2003.
Copyright: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

B. Previous Work

At root, finite-dimensional frame design is an algebraic problem. It boils down to producing a structured matrix with certain spectral properties, which may require elaborate discrete and combinatorial mathematics. In the past, most design methods have employed these techniques. To appreciate the breadth of this literature, one might peruse Sarwate's recent survey paper about tight frames with unit-norm vectors [16]. The last few years have also seen some essentially algebraic algorithms that can construct tight frames with non-constant vector norms [7], [17], [18].

When algebraic methods work, they work brilliantly. A numerical approach like alternating projection can hardly hope to compete with the most profound insights of engineers and mathematicians. On the other hand, algebraic and combinatoric tools are not always effective. For example, we might require a structured tight frame for a vector space whose dimension is not a prime-power. Even in these situations, alternating projection will usually succeed. Moreover, it can help researchers develop the insight necessary for completing an algebraic construction. The power of alternating projection comes from replacing the difficult algebra with a simple analytic question: How does one find an ensemble of structured vectors nearest to a given tight frame? This minimization problem can usually be dispatched with standard tools, such as differential calculus or Karush–Kuhn–Tucker theory.

The literature does not offer many other numerical approaches to frame design. It appears that most current algorithms can be traced to the discovery by Rupf–Massey [6] and Viswanath–Anantharam [7] that tight frames with prescribed column norms are the optimal sequences for DS-CDMA systems. This application prompted a long series of papers, including [19]–[23], that describe iterative methods for constructing tight frames with prescribed column norms. These techniques are founded on an oblique characterization of tight frames as the minimizers of a quantity called total squared correlation (TSC). It is not clear how one could generalize these methods to solve different types of frame design problems. Moreover, the alternating projection approach that we propose significantly outperforms at least one of the TSC-minimization algorithms. Two of the algebraic methods that we mentioned above, [7] and [18], were also designed with the DS-CDMA application in mind, while the third algebraic method [17] comes from the soi-disant frame community. We are not aware of any other numerical methods for frame design.

C. Outline

Section II continues with a short introduction to tight frames. Then, in Section III, we state two formal frame design problems. Connections among frame design problems, inverse eigenvalue problems and matrix nearness problems are established. This provides a natural segue to the alternating projection algorithm. Afterward, we apply the basic algorithm to design three different types of structured frames, in order of increasing implementation difficulty. Section IV discusses tight frames with prescribed column norms; Section V covers equiangular tight frames; and Section VI constructs tight frames whose individual vectors have low peak-to-average-power ratio. Each of these sections contains numerical experiments. The body of the paper concludes with Section VII, which discusses the method, its limitations and its extensions.

The back matter contains the bulk of the analysis. Appendix I offers a tutorial on point-to-set maps, and Appendix II applies this theory to obtain a rigorous characterization of the algorithm's convergence behavior. The first appendix also contains a brief survey of the alternating projection literature.

II. TIGHT FRAMES

This section offers a terse introduction to the properties of tight frames that are essential for our method. For more details, see [1], for example.

A. Frames

Let $\alpha$ and $\beta$ be positive constants. A finite frame for the complex¹ Hilbert space $\mathbb{C}^d$ is a sequence of $N$ vectors $\{x_n\}_{n=1}^N$ drawn from $\mathbb{C}^d$ that satisfies a generalized Parseval condition:

$$\alpha\,\|v\|_2^2 \;\le\; \sum_{n=1}^N |\langle v, x_n\rangle|^2 \;\le\; \beta\,\|v\|_2^2 \qquad \text{for all } v \in \mathbb{C}^d. \tag{1}$$

We denote the Euclidean inner product with $\langle\cdot,\cdot\rangle$, and we write $\|\cdot\|_2$ for the associated norm. The numbers $\alpha$ and $\beta$ are called the lower and upper frame bounds. The number of vectors in the frame may be no smaller than the dimension of the space (i.e., $N \ge d$).

If it is possible to take $\alpha = \beta$, then we have a tight frame or an $\alpha$-tight frame. When the frame vectors all have unit norm, i.e. $\|x_n\|_2 \equiv 1$, the system is called a unit-norm frame. Unit-norm tight frames are also known as Welch-Bound-Equality sequences [12], [24]. Tight frames with non-constant vector norms have also been called general Welch-Bound-Equality sequences [7].

B. Associated Matrices

Form a $d \times N$ matrix with the frame vectors as its columns:

$$X = \begin{bmatrix} x_1 & x_2 & x_3 & \dots & x_N \end{bmatrix}.$$

This matrix is referred to as the frame synthesis operator, but we shall usually identify the synthesis operator with the frame itself.

Two other matrices arise naturally in connection with the frame. We define the Gram matrix as $G \overset{\text{def}}{=} X^* X$. (The symbol ${}^*$ indicates conjugate transposition of matrices and vectors.) The diagonal entries of the Gram matrix equal the squared norms of the frame vectors, and the off-diagonal entries of the Gram matrix equal the inner products between distinct frame vectors. The Gram matrix is always Hermitian and positive semi-definite, and it has rank $d$.

¹We work with complex vectors for complete generality. The adaptations for real vectors are transparent.


The positive-definite matrix $XX^*$ is usually called the frame operator. Since

$$v^*(XX^*)\,v = \sum_{n=1}^N |\langle v, x_n\rangle|^2,$$

we can rewrite (1) as

$$\alpha \;\le\; \frac{v^*(XX^*)\,v}{v^* v} \;\le\; \beta. \tag{2}$$

That is, any Rayleigh quotient of $XX^*$ must lie between $\alpha$ and $\beta$. It follows from the Courant–Fischer Theorem [25] that each eigenvalue of the frame operator lies in the interval $[\alpha, \beta]$.
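As a concrete numerical illustration of (2), the optimal frame bounds are the extreme Rayleigh quotients of the frame operator, i.e. its smallest and largest eigenvalues. A minimal sketch (the 2 × 3 frame below is an arbitrary example of ours, not one from the paper):

```python
import numpy as np

# Optimal frame bounds: by (2) and Courant-Fischer, alpha and beta are
# the smallest and largest eigenvalues of the frame operator XX^*.
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])     # an arbitrary 2 x 3 (non-tight) frame
eigs = np.linalg.eigvalsh(X @ X.T)  # eigenvalues in ascending order
alpha, beta = eigs[0], eigs[-1]
```

Here `alpha = 1` and `beta = 3`; the frame would be tight precisely when the two bounds coincide.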

When the frame is $\alpha$-tight, condition (2) is equivalent to the statement that $XX^* = \alpha\, I_d$. Three other characterizations of an $\alpha$-tight frame follow immediately.

Proposition 1: A $d \times N$ frame $X$ is $\alpha$-tight if and only if it satisfies one (hence all) of the following conditions.

1) All $d$ non-zero singular values of $X$ equal $\sqrt{\alpha}$.
2) All $d$ non-zero eigenvalues of the Gram matrix $X^* X$ equal $\alpha$.
3) The rows of $\alpha^{-1/2}\, X$ form an orthonormal family.

These properties undergird our method for constructing tight frames. It is now clear that being a tight frame is a spectral requirement on the matrix $X$.
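The three conditions of Proposition 1 are easy to check numerically. The sketch below builds a unit-norm tight frame from rows of the unitary DFT matrix; this particular construction is our illustrative choice, not one prescribed by the paper:

```python
import numpy as np

# A numerical check of Proposition 1 on a unit-norm tight frame built
# from d rows of the unitary DFT matrix (an illustrative construction).
d, N = 3, 7
F = np.fft.fft(np.eye(N)) / np.sqrt(N)   # N x N unitary DFT matrix
alpha = N / d                            # unit-norm columns force alpha = N/d
X = np.sqrt(alpha) * F[:d, :]            # d x N tight frame: XX^* = alpha I_d

# 1) All d non-zero singular values of X equal sqrt(alpha).
assert np.allclose(np.linalg.svd(X, compute_uv=False), np.sqrt(alpha))

# 2) The d non-zero eigenvalues of the Gram matrix X^*X equal alpha.
eigs = np.sort(np.linalg.eigvalsh(X.conj().T @ X))
assert np.allclose(eigs[-d:], alpha) and np.allclose(eigs[:-d], 0.0)

# 3) The rows of alpha**(-1/2) * X form an orthonormal family.
R = X / np.sqrt(alpha)
assert np.allclose(R @ R.conj().T, np.eye(d))
```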

C. Norms of Frame Vectors

Throughout this article, we shall denote the squared norms of the frame vectors as $c_n \overset{\text{def}}{=} \|x_n\|_2^2$.

There is an intimate relationship between the tightness parameter of an $\alpha$-tight frame and the norms of its columns. The computation is straightforward:

$$\alpha\, d = \operatorname{Tr} XX^* = \operatorname{Tr} X^* X = \sum_{n=1}^N \|x_n\|_2^2 = \sum_{n=1}^N c_n. \tag{3}$$

The notation $\operatorname{Tr}(\cdot)$ represents the matrix trace operator, which sums the diagonal entries of its argument.

A related point is that one cannot construct a tight frame with an arbitrary set of column norms. According to the Schur–Horn Theorem, a Hermitian matrix with prescribed diagonal entries and eigenvalues can exist if and only if the diagonal entries majorize² the eigenvalues [25], [26]. If $X$ is a $d \times N$ tight frame, the $d$ non-zero eigenvalues of its Gram matrix all equal $\sum_n c_n / d$. Meanwhile, the diagonal entries of the Gram matrix are $c_1, \dots, c_N$. In this case, the majorization condition is equivalent to the system of inequalities

$$0 \le c_k \le \frac{1}{d} \sum_{n=1}^N c_n \qquad \text{for each } k = 1, \dots, N. \tag{4}$$

It follows that a tight frame with squared column norms $c_1, \dots, c_N$ exists if and only if (4) holds. For an arbitrary set of column norms, the frames that are "closest" to being tight have been characterized in [7], [27].

²The literature equivocates about the direction of the majorization relation. We adopt the sense used by Horn and Johnson [25].
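The feasibility test (4), combined with the trace identity (3), is a one-liner to apply numerically. A sketch; the function name and the two example inputs are our illustrative choices:

```python
import numpy as np

# Check whether squared column norms c_1, ..., c_N admit a d x N tight
# frame: recover alpha from (3), then test the majorization condition (4).
def tight_frame_feasible(c, d):
    c = np.asarray(c, dtype=float)
    alpha = c.sum() / d                               # (3): alpha*d = sum_n c_n
    feasible = bool(np.all((c >= 0) & (c <= c.sum() / d)))
    return feasible, alpha

ok, alpha = tight_frame_feasible(np.ones(7), d=3)   # unit norms: feasible
bad, _ = tight_frame_feasible([10, 1, 1, 1], d=3)   # one column too heavy
```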

III. DESIGN VIA ALTERNATING PROJECTIONS

This section begins with formal statements of two frame design problems. Next we establish a connection with inverse eigenvalue problems. It becomes clear that an alternating projection algorithm offers a simple and natural approach to general inverse eigenvalue problems, including both frame design problems. We then solve the matrix nearness problems that arise when implementing the proposed algorithm. The section concludes with a discussion of the algorithm's convergence properties.

A. Structured Tight Frames

Define the collection of $d \times N$ $\alpha$-tight frames:

$$\mathcal{X}_\alpha \overset{\text{def}}{=} \{ X \in \mathbb{C}^{d \times N} : XX^* = \alpha\, I_d \}. \tag{5}$$

We fix the tightness parameter $\alpha$ for simplicity. It is easy to extend our methods to situations where the tightness is not predetermined, and one can apply similar ideas to construct frames with prescribed upper and lower frame bounds, viz. the parameters $\alpha$ and $\beta$ in (1). It is worth noting that $\mathcal{X}_\alpha$ is essentially the Stiefel manifold, which consists of all sets of $d$ orthonormal vectors in $\mathbb{C}^N$ [28].

Let $\mathcal{S}$ denote a closed³ collection of $d \times N$ matrices that possess some structural property. In the sequel, we shall explore several different structural constraints that have arisen in electrical engineering contexts. Section IV considers tight frames with specified column norms, and Section VI shows how to construct tight frames whose individual vectors have a low peak-to-average-power ratio. Many other structural properties are possible.

Each constraint set $\mathcal{S}$ raises a basic question.

Problem 1: Find a matrix in $\mathcal{S}$ that is minimally distant from $\mathcal{X}_\alpha$ with respect to a given norm.

If the two sets intersect, any solution to this problem is a structured tight frame. Otherwise, the problem requests a structured matrix that is "most nearly tight." A symmetric problem is to find a tight frame that is "most nearly structured."

B. Structured Gram Matrices

If the structural constraints restrict the inner products between frame vectors, it may be more natural to work with Gram matrices. Define a collection that contains the Gram matrices of all $d \times N$ $\alpha$-tight frames:

$$\mathcal{G}_\alpha \overset{\text{def}}{=} \bigl\{ G \in \mathbb{C}^{N \times N} : G = G^* \text{ and } G \text{ has eigenvalues } (\underbrace{\alpha, \dots, \alpha}_{d}, 0, \dots, 0) \bigr\}. \tag{6}$$

The set $\mathcal{G}_\alpha$ is essentially the Grassmannian manifold that consists of $d$-dimensional subspaces of $\mathbb{C}^N$ [29]. One may also identify the matrices in $\mathcal{G}_\alpha$ as rank-$d$ orthogonal projectors, scaled by $\alpha$. (An orthogonal projector can be defined as an idempotent, Hermitian matrix. The rank of a projector equals the dimension of its range.)

³We equip $\mathbb{C}^{d \times N}$ and $\mathbb{C}^{N \times N}$ with the topology induced by the Frobenius norm, which is identical with every other norm topology [25].

Let $\mathcal{H}$ be a closed collection of $N \times N$ Hermitian matrices that possess some structural property. In Section V, for example, we shall consider equiangular tight frames. The Gram matrices of these frames have off-diagonal entries with identical moduli, and it is an important challenge to construct them.

Once again, a fundamental question arises.

Problem 2: Find a matrix in $\mathcal{G}_\alpha$ that is minimally distant from $\mathcal{H}$ with respect to a given norm.

If the two sets intersect, any solution to this problem will lie in the intersection. Otherwise, the problem requests a tight frame whose Gram matrix is "most nearly structured." We do not mention the problem of producing a matrix in $\mathcal{H}$ that is nearest to $\mathcal{G}_\alpha$ because it is not generally possible to factor a matrix in $\mathcal{H}$ to obtain a frame with dimensions $d \times N$.

C. Inverse Eigenvalue Problems

We view Problems 1 and 2 as inverse eigenvalue problems (IEPs). As Chu explains in [30], an IEP is an inquiry about structured matrices with prescribed spectral properties. These spectral properties may include restrictions on eigenvalues, eigenvectors, singular values or singular vectors. According to Proposition 1, the defining characteristic of a tight frame is its spectrum, so frame design is an IEP.

In the study of IEPs, the two fundamental questions are solvability and computability. The former problem is to find necessary or sufficient conditions under which a given IEP has a solution. The latter problem is how to produce a matrix that has given spectral properties and simultaneously satisfies a structural constraint. The solvability and computability of some classical IEPs have been studied extensively by the matrix analysis community, although many open problems still remain. The articles [30], [31] survey this literature.

Although specific IEPs may require carefully tailored numerical methods, there are a few general tools available. One scheme is the coordinate-free Newton method, which has been explored in [32]–[34]. Newton-type algorithms do not apply to most problems, and they only converge locally. Another general method is the projected gradient approach developed by Chu and Driessel in [35]. This technique involves numerical integration of a matrix differential equation, which relies on advanced software that may not be readily available. Another problem with Newton methods and projected gradient methods is that they may not handle repeated singular values well. This shortcoming makes them a poor candidate for constructing tight frames, which have only two distinct singular values.

This article concentrates on alternating projection, which has occasionally been used to solve inverse eigenvalue problems (in [36] and [37], for example). But alternating projection has not been recognized as a potential method for solving any type of inverse eigenvalue problem. The most general treatment of alternating projection in the IEP literature is probably [38], but the authors do not offer a serious analysis of their algorithm's behavior.

Here is the basic idea behind alternating projection. We seek a point of intersection between the set of matrices that satisfy a structural constraint and the set of matrices that satisfy a spectral constraint. An alternating projection begins at a matrix in the first set, from which it computes a matrix of minimal distance in the second set. Then the algorithm reverses the roles of the two sets and repeats the process ad infinitum. Alternating projection is easy to apply, and it is usually globally convergent in a weak sense, as we show later.

D. Alternating Projections

Let us continue with a formal presentation of the generic alternating projection method for solving inverse eigenvalue problems. Suppose that we have two collections, $\mathcal{Y}$ and $\mathcal{Z}$, of matrices with identical dimensions. Of course, we are imagining that one collection of matrices incorporates a spectral constraint while the other collection incorporates a structural constraint. To ensure that the algorithm is well-posed, assume that one collection is closed and the other is compact.

Algorithm 1 (Alternating Projection):

INPUT:
• An (arbitrary) initial matrix $Y_0$ with appropriate dimensions
• The number of iterations, $J$

OUTPUT:
• A matrix $Y$ in $\mathcal{Y}$ and a matrix $Z$ in $\mathcal{Z}$

PROCEDURE:
1) Initialize $j = 0$.
2) Find a matrix $Z_j$ in $\mathcal{Z}$ such that
$$Z_j \in \arg\min_{Z \in \mathcal{Z}} \|Z - Y_j\|_F.$$
We use $\|\cdot\|_F$ to indicate the Frobenius norm.
3) Find a matrix $Y_{j+1}$ in $\mathcal{Y}$ such that
$$Y_{j+1} \in \arg\min_{Y \in \mathcal{Y}} \|Y - Z_j\|_F.$$
4) Increment $j$ by one.
5) Repeat Steps 2–4 until $j = J$.
6) Let $Y = Y_J$ and $Z = Z_{J-1}$.

A solution to the optimization problem in Step 2 is called a projection of $Y_j$ onto $\mathcal{Z}$, in analogy with the case where $\mathcal{Z}$ is a linear subspace. Step 3 computes the projection of $Z_j$ onto $\mathcal{Y}$. In a Hilbert space, it can be shown geometrically that a given point has a unique projection onto each closed, convex set. Projections onto general sets may not be uniquely determined, which fiercely complicates the analysis of Algorithm 1.
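In code, the generic loop of Algorithm 1 is only a few lines once the two projections are supplied. The sketch below instantiates it for a prescribed-column-norm constraint; both projection routines and all names are our illustrative choices (the column-rescaling step anticipates the treatment in Section IV and ignores the degenerate case of a zero column):

```python
import numpy as np

def nearest_tight_frame(Z, alpha):
    # Projection onto the alpha-tight frames (XX^* = alpha*I): scale the
    # unitary polar part U V^* obtained from the SVD of Z.
    U, _, Vh = np.linalg.svd(Z, full_matrices=False)
    return np.sqrt(alpha) * (U @ Vh)

def nearest_with_column_norms(Z, c):
    # Projection onto the structured set S: rescale each (nonzero) column
    # to the prescribed squared norm c_n.
    return Z * (np.sqrt(c) / np.linalg.norm(Z, axis=0))

def alternating_projection(Y0, c, J=200):
    d = Y0.shape[0]
    alpha = np.sum(c) / d            # tightness forced by the trace identity (3)
    Y = Y0
    for _ in range(J):               # fixed iteration count, as in Step 5
        Z = nearest_tight_frame(Y, alpha)
        Y = nearest_with_column_norms(Z, c)
    return Y, Z                      # Y = Y_J, Z = Z_{J-1}

rng = np.random.default_rng(0)
d, N = 3, 6
c = np.ones(N)                       # ask for a unit-norm tight frame
Y, Z = alternating_projection(rng.standard_normal((d, N)), c)
```

By construction, `Y` has exactly the prescribed column norms and `Z` is exactly tight; in typical runs the two agree to high precision after a few hundred iterations.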

von Neumann, in 1933, was the first to consider alternating projection methods. He showed that if $\mathcal{Y}$ and $\mathcal{Z}$ are closed, linear subspaces of a Hilbert space, then alternating projection converges to the point in $\mathcal{Y} \cap \mathcal{Z}$ nearest to $Y_0$ [39]. In 1959, Cheney and Goldstein demonstrated that alternating projection between two compact, convex subsets of a Hilbert space always yields a point in one set at minimal distance from the opposite set [4]. These two results inspired the application of Algorithm 1 to the inverse eigenvalue problems, Problems 1 and 2. Of course, the constraint sets that we consider are generally not convex. For a more extensive discussion of the literature on alternating projection, turn to Appendix I-G.

To implement the alternating projection algorithm, one must first solve the minimization problems in Steps 2 and 3. For obvious reasons, these optimizations are called the matrix nearness problems associated with $\mathcal{Y}$ and $\mathcal{Z}$. Already there is an extensive literature on matrix nearness problems. See, for example, the articles [40]–[42], the survey [43] and many sections of the book [25]. Section III-F exhibits solutions to the nearness problems associated with the spectral constraints $\mathcal{X}_\alpha$ and $\mathcal{G}_\alpha$. Even when it is necessary to solve a new nearness problem, this task often reduces to an exercise in differential calculus. This is one of the great advantages of Algorithm 1. In this article, we shall always measure the distance between matrices using the Frobenius norm $\|\cdot\|_F$ because it facilitates the solution of matrix nearness problems. Of course, one could develop a formally identical algorithm using other norms, metrics or divergences.

Since the constraint sets are generally non-convex, alternating projection may not converge as well as one might wish. This explains why we have chosen to halt the algorithm after a fixed number of steps instead of waiting for $\|Y_j - Y_{j+1}\|_F$ to decline past a certain threshold. Indeed, it is theoretically possible that the sequence of iterates will not converge in norm. In practice, it appears that norm convergence always occurs. Section III-G provides a skeletal discussion of the theoretical convergence of alternating projection. We do not flesh out the analysis until Appendices I and II because a proper treatment requires some uncommon mathematics.

E. Application to Problems 1 and 2

To solve Problem 1, we propose an alternating projection between the structural constraint set $\mathcal{S}$ and the spectral constraint set $\mathcal{X}_\alpha$. Two matrix nearness problems arise. In the next subsection, we demonstrate how to find a tight frame in $\mathcal{X}_\alpha$ nearest to an arbitrary matrix. Sections IV and VI contain detailed treatments of two different structural constraints.

To solve Problem 2, we alternate between the spectral constraint $\mathcal{G}_\alpha$ and the structural constraint $\mathcal{H}$. In the next subsection, we show how to produce a matrix in $\mathcal{G}_\alpha$ that is nearest to an arbitrary matrix. In Section V, we analyze a specific structural constraint $\mathcal{H}$. After performing the alternating projection, it may still be necessary to extract a frame $X$ from the output Gram matrix. This is easily accomplished with a rank-revealing QR factorization or with an eigenvalue decomposition. Refer to [44] for details.

F. Nearest Tight Frames

Standard tools of numerical linear algebra can be used to calculate an $\alpha$-tight frame that is closest to an arbitrary matrix in Frobenius norm.

Theorem 2:Let N ≥ d, and suppose that thed×N matrixZ has singular value decompositionUΣV ∗. With respect tothe Frobenius norm, a nearestα-tight frame toZ is givenby α UV ∗. Note that UV ∗ is the unitary part of a polarfactorization ofZ .

Assume in addition thatZ has full row-rank. Thenα UV ∗

is the uniqueα-tight frame closest toZ . Moreover, one maycomputeUV ∗ using the formula(ZZ∗)−1/2 Z .

Proof: The proof of this well-known result is similar to that of Theorem 3, which appears below. See also pp. 431–432 of [25]. Classical references on related problems include [45], [46]. The formula for the polar factor may be verified with a direct calculation.

It is also straightforward to compute a matrix in Gα nearest to an arbitrary Hermitian matrix. This theorem appears to be novel, so we provide a short demonstration.

Theorem 3: Suppose that Z is an N × N Hermitian matrix with a unitary factorization UΛU*, where the entries of Λ are arranged in algebraically non-increasing order. Let Ud be the N × d matrix formed from the first d columns of U. Then α UdUd* is a matrix in Gα that is closest to Z with respect to the Frobenius norm. This closest matrix is unique if and only if λd strictly exceeds λd+1.

Proof: We must minimize ‖Z − α G‖F over all rank-d orthogonal projectors G. Square and expand this objective function:

‖Z − α G‖²F = ‖Z‖²F + α² ‖G‖²F − 2α Re Tr G*Z.

The squared Frobenius norm of an orthogonal projector equals its rank, so the first two terms are constant, and we need only maximize Re Tr G*Z.

Every rank-d orthogonal projector G can be written as G = VV*, where the N × d matrix V satisfies V*V = Id. Meanwhile, we may factor Z into its eigenvalue decomposition UΛU*, where U is unitary and Λ is a diagonal matrix with non-increasing, real entries. Using the properties of the trace operator, we calculate that

Re Tr G*Z = Re Tr (VV*)(UΛU*) = Re Tr (U*VV*U) Λ = Re ∑_{n=1}^{N} (U*VV*U)_{nn} λn.

Observe that U*VV*U is a positive semi-definite matrix whose eigenvalues do not exceed one. Therefore, the diagonal entries of U*VV*U are real numbers that lie between zero and one inclusive. Moreover, these diagonal entries must sum to d because

Tr U*VV*U = ‖U*V‖²F = ‖V‖²F = d.

It follows that

max_G Re Tr G*Z ≤ ∑_{n=1}^{d} λn.

The bound is met whenever the diagonal of U*VV*U contains d ones followed by (N − d) zeroes. This sufficient condition for attainment can be written

U*VV*U = Id ⊕ 0_{N−d}.

Furthermore, if λd > λd+1, this condition is also necessary. Form a matrix Ud by extracting the first d columns of U. Then the sufficient condition holds whenever G = VV* = UdUd*. That is, G is the orthogonal projector onto any d-dimensional subspace spanned by eigenvectors corresponding to the d algebraically largest eigenvalues of Z. If λd > λd+1, this subspace is uniquely determined. The orthogonal projector onto a fixed subspace is always unique, and the uniqueness claim follows.

It may be valuable to know that there are specialized algorithms for performing the calculations required by Theorems 2 and 3. For example, Higham has developed stable numerical methods for computing the polar factor of a matrix [47], [48] that are more efficient than computing a singular value decomposition or applying the formula (ZZ*)^{−1/2} Z.
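Both nearness problems reduce to a single dense factorization. The following numpy sketch (function names ours; it adopts the convention XX* = α Id used for Xα above) is one way to carry out the computations of Theorems 2 and 3:

```python
import numpy as np

def nearest_tight_frame(Z, alpha):
    # Theorem 2: replace every singular value of Z by sqrt(alpha), i.e.,
    # scale the partial isometry U V* from the SVD/polar factorization.
    U, _, Vh = np.linalg.svd(Z, full_matrices=False)
    return np.sqrt(alpha) * U @ Vh

def nearest_gram(Z, d, alpha):
    # Theorem 3: keep the eigenvectors of the d algebraically largest
    # eigenvalues of the Hermitian matrix Z.
    _, U = np.linalg.eigh(Z)              # eigenvalues in ascending order
    U_d = U[:, -d:]
    return alpha * U_d @ U_d.conj().T

rng = np.random.default_rng(1)
X = nearest_tight_frame(rng.standard_normal((3, 5)), 5 / 3)
print(np.allclose(X @ X.conj().T, (5 / 3) * np.eye(3)))               # True

Zh = rng.standard_normal((5, 5))
G = nearest_gram(Zh + Zh.T, 3, 5 / 3)
print(np.allclose(np.linalg.eigvalsh(G), [0, 0, 5/3, 5/3, 5/3]))      # True
```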

G. Basic Convergence Results

It should be clear that alternating projection never increases the distance between successive iterates. This does not mean that it will locate a point of minimal distance between the constraint sets. It can be shown, however, that Algorithm 1 is globally convergent in a weak sense.

Define the distance between a point M and a set Y via

dist(M, Y) = inf_{Y∈Y} ‖Y − M‖F.

Theorem 4 (Global Convergence of Algorithm 1): Let Y and Z be closed sets, one of which is bounded. Suppose that alternating projection generates a sequence of iterates (Yj, Zj). This sequence has at least one accumulation point.
• Every accumulation point lies in Y × Z.
• Every accumulation point (Ȳ, Z̄) satisfies

‖Ȳ − Z̄‖F = lim_{j→∞} ‖Yj − Zj‖F.

• Every accumulation point (Ȳ, Z̄) satisfies

‖Ȳ − Z̄‖F = dist(Ȳ, Z) = dist(Z̄, Y).

For a proof of Theorem 4, turn to Appendix II-A. In some special cases, it is possible to develop stronger convergence results and characterizations of the fixed points. We shall mention these results where they are relevant. The convergence of Algorithm 1 is geometric at best [49]–[52]. This is the major shortfall of alternating projection methods.

Note that the sequence of iterates may have many accumulation points. Moreover, the last condition does not imply that the accumulation point (Ȳ, Z̄) is a fixed point of the alternating projection. So what are the potential accumulation points of a sequence of iterates? Since the algorithm never increases the distance between successive iterates, the set of accumulation points includes every pair of matrices in Y × Z that lie at minimal distance from each other. Therefore, we say that the algorithm tries to solve Problems 1 and 2.

IV. PRESCRIBED COLUMN NORMS

As a first illustration of alternating projection, let us consider the most basic frame design problem: How does one build a tight frame with prescribed column norms?

This question has arisen in the context of constructing optimal signature sequences for direct-spread synchronous code-division multiple-access (DS-CDMA) channels. There are some finite algorithms available that yield a small number of solutions to the problem [7], [18]. These methods exploit the connection between frames and the Schur–Horn Theorem. They work by applying plane rotations to an initial tight frame to adjust its column norms while maintaining its tightness. Casazza and Leon have also developed a finite method that seems different in spirit [17].

To construct larger collections of frames, some authors have proposed iterative algorithms [19]–[23]. These techniques attempt to minimize the total squared correlation (TSC) of an initial matrix subject to constraints on its column norms. The TSC of a matrix is defined as

TSC(S) def= ‖S*S‖²F = ∑_{m,n} |⟨sm, sn⟩|².

If we fix the squared column norms of S to be c1, . . . , cN, a short algebraic manipulation shows that minimizing the TSC is equivalent to solving

min_S ‖SS* − α Id‖F

where α = ∑_n cn/d. In words, minimizing the TSC is equivalent to finding a frame with prescribed column norms that is closest in Frobenius norm to a tight frame [53].
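This equivalence is easy to check numerically. The sketch below (variable names ours) verifies the identity ‖SS* − α Id‖²F = TSC(S) − 2α ∑n cn + α² d on a random matrix, which shows that the two objectives differ by a constant once the column norms, and hence α, are fixed:

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.standard_normal((3, 5))
c_sum = np.sum(np.linalg.norm(S, axis=0) ** 2)   # sum of squared column norms
alpha = c_sum / 3

tsc = np.linalg.norm(S.T @ S, "fro") ** 2        # TSC(S) = ||S*S||_F^2
dist_sq = np.linalg.norm(S @ S.T - alpha * np.eye(3), "fro") ** 2

# The objectives differ by the constant  alpha^2 * d - 2 * alpha * sum(c_n):
print(np.isclose(dist_sq, tsc - 2 * alpha * c_sum + alpha ** 2 * 3))   # True
```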

In comparison, alternating projection affords an elegant way to produce many tight frames with specified column norms. It focuses on the essential property of a tight frame, its singular values, to solve the problem. In this special case, we provide a complete accounting of the behavior of the alternating projection. Moreover, experiments show that it outperforms some of the other iterative algorithms that were developed specifically for this problem.

A. Constraint Sets and Nearness Problems

The algorithm will alternate between the set of matrices with fixed column norms and the set of tight frames with an appropriate tightness parameter.

Let the positive numbers c1, . . . , cN denote the squared column norms that are desired. We do not require that these numbers satisfy the majorization inequalities given in (4), although one cannot hope to find a tight frame if these inequalities fail. In that case, we would seek a matrix with the prescribed column norms that is closest to being a tight frame. In the DS-CDMA application, the column norms depend on the users' power constraints [6], [7].

In light of (3), the relation between the tightness parameter and the column norms, it is clear that α must equal ∑_n cn/d. The spectral constraint set becomes

Xα def= { X ∈ C^{d×N} : XX* = (∑_n cn/d) Id }.

Given an arbitrary d × N matrix, one may compute the closest tight frame in Xα using Theorem 2.

The structural constraint set contains matrices with the correct column norms:

S def= { S ∈ C^{d×N} : ‖sn‖²₂ = cn, n = 1, . . . , N }.

It is straightforward to solve the matrix nearness problem associated with this collection.


Proposition 5: Let Z be an arbitrary matrix with columns zn. A matrix in S is closest to Z in Frobenius norm if and only if it has the columns

sn = √cn zn/‖zn‖₂ when zn ≠ 0, and sn = √cn un when zn = 0,

where un represents an arbitrary unit vector. If the columns of Z are all non-zero, then the solution to the nearness problem is unique.

Proof: We must minimize ‖S − Z‖F over all matrices S from S. Square and rewrite this objective function:

‖S − Z‖²F = ∑_{n=1}^{N} ‖sn − zn‖²₂.

We can minimize each summand separately. Fix an index n, and expand the n-th term using ‖sn‖²₂ = cn:

‖sn − zn‖²₂ = cn + ‖zn‖²₂ − 2√cn Re ⟨sn/‖sn‖₂, zn⟩.

If zn ≠ 0, the unique maximizer of Re ⟨u, zn⟩ over all unit vectors is u = zn/‖zn‖₂. If zn = 0, then every unit vector u maximizes the inner product.
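Combining Theorem 2 with Proposition 5 yields the complete alternating projection for this design problem. The following sketch is our own code, not the authors' Matlab implementation, and it assumes that a random starting point avoids the non-tight fixed points discussed below, which appears to be the generic behavior:

```python
import numpy as np

def project_columns(Z, c):
    # Proposition 5: rescale each non-zero column to squared norm c_n.
    S = np.array(Z, copy=True)
    for n, cn in enumerate(c):
        nrm = np.linalg.norm(S[:, n])
        if nrm > 0:
            S[:, n] *= np.sqrt(cn) / nrm
    return S

def tight_frame_with_norms(c, d, iters=2000, seed=0):
    c = np.asarray(c, dtype=float)
    alpha = c.sum() / d
    S = np.random.default_rng(seed).standard_normal((d, c.size))
    for _ in range(iters):
        U, _, Vh = np.linalg.svd(S, full_matrices=False)
        X = np.sqrt(alpha) * U @ Vh        # Theorem 2: nearest tight frame
        S = project_columns(X, c)          # Proposition 5: restore norms
    return S

S = tight_frame_with_norms([1, 1, 1, 1, 1], d=3)
print(np.linalg.norm(S @ S.T - (5 / 3) * np.eye(3)))   # residual shrinks toward 0
```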

B. Convergence Results

In this setting, alternating projection converges in a fairly strong sense.

Theorem 6: Let S0 have full rank and non-zero columns, and suppose that the alternating projection generates a sequence of iterates (Sj, Xj). This sequence possesses at least one accumulation point, say (S̄, X̄).
• Both S̄ and X̄ have full rank and non-zero columns.
• The pair (S̄, X̄) is a fixed point of the alternating projection. In other words, if we applied the algorithm to S̄ or to X̄, every pair of iterates would equal (S̄, X̄).

• The accumulation point satisfies

‖S̄ − X̄‖F = lim_{j→∞} ‖Sj − Xj‖F.

• The component sequences are asymptotically regular, i.e.

‖Sj+1 − Sj‖F → 0 and ‖Xj+1 − Xj‖F → 0.

• Either the component sequences both converge in norm,

‖Sj − S̄‖F → 0 and ‖Xj − X̄‖F → 0,

or the set of accumulation points forms a continuum.

Proof: See Appendix II-C.

In the present case, it is also possible to characterize completely the fixed points of the algorithm that lie in S.

Proposition 7: A full-rank matrix S from S is a fixed point of the alternating projection between S and Xα if and only if its columns are all eigenvectors of SS*. That is, SS*S = SΛ, where Λ ∈ C^{N×N} is diagonal and positive with no more than d distinct entries.

Proof: Refer to Appendix II-D.

Many of the fixed points in S do not lie at minimal distance from Xα, so they are not solutions to Problem 1. Nevertheless, the fixed points still have a tremendous amount of structure. Each fixed point can be written as a union of tight frames for mutually orthogonal subspaces of C^d, and the set of fixed points is identical with the set of critical points of the TSC functional subject to the column norm constraint [23], [53]. The Ulukus-Yates algorithm, another iterative method for designing tight frames with specified column norms, has identical fixed points [20].

C. Numerical Examples

We offer a few simple examples to illustrate that the algorithm succeeds, and we provide some comparisons with the Ulukus-Yates algorithm.

Suppose first that we wish to construct a unit-norm tight frame for R³ consisting of five vectors. Initialized with a 3 × 5 matrix whose columns are chosen uniformly at random from the surface of the unit sphere, the algorithm returns

S = [  0.1519   0.4258  −0.7778   0.0160  −0.9258
       0.9840  −0.6775   0.1882   0.3355  −0.3024
      −0.0926   0.5998   0.5997   0.9419  −0.2269 ].

Each column norm of the displayed matrix equals one to machine precision, and the singular values are identical in their first eight digits. In all the numerical examples, the algorithm was terminated on the condition that ‖Sj+1 − Sj‖F < 10⁻⁸. Implemented in Matlab, the computation took 65 iterations, which lasted 0.0293 seconds on a 1.6 GHz machine.

Now let us construct a tight frame for R³ whose five vectors have norms 0.75, 0.75, 1, 1.25 and 1.25. With random initialization, we obtain

S = [ −0.1223   0.1753  −0.7261   0.0128  −1.0848
       0.7045  −0.6786   0.6373   0.0972  −0.6145
      −0.2263   0.2670   0.2581   1.2461  −0.0894 ].

The column norms are correct to machine precision, and the singular values are identical to seven digits. The computation took 100 iterations, which lasted 0.0487 seconds.

Next we examine a case where the column norms do not satisfy the majorization condition. Suppose that we seek a "nearly tight" frame with column norms 0.5, 0.5, 1, 1 and 2. Random initialization yields

S = [ −0.1430   0.1353  −0.4351  −0.0941  −1.8005
       0.4293  −0.4213   0.7970  −0.2453  −0.7857
      −0.2127   0.2329   0.4189   0.9649  −0.3754 ].

The column norms are all correct, but, as predicted, the frame is not tight. Nevertheless, the last vector is orthogonal to the first four vectors, which form a tight frame for their span. This is an optimal solution to the frame design problem. The calculation required 34 iterations over 0.0162 seconds.

Of course, alternating projection can produce complex-valued tight frames, as well as larger frames in higher-dimensional spaces. Such ensembles are too large to display in these columns. To give a taste of the algorithm's general performance, we have compared it with our implementation of the Ulukus-Yates algorithm [20]. To construct unit-norm tight frames of various lengths, we initialized each algorithm with the same random matrix. Then we plotted the comparative execution times. Figure 1 shows the results for 64 real dimensions, and Figure 2 shows the results for 64 complex dimensions. Note the different scales on the time axes.

[Figure: "Comparative Execution Times in Real Dimension d = 64"; execution time (sec) versus number of vectors N, for N = 60 to 200.]
Fig. 1. Comparison of alternating projection with the Ulukus-Yates algorithm in 64 real dimensions.

[Figure: "Comparative Execution Times in Complex Dimension d = 64"; execution time (sec) versus number of vectors N, for N = 60 to 200.]
Fig. 2. Comparison of alternating projection with the Ulukus-Yates algorithm in 64 complex dimensions.

Both algorithms perform slowly when N is small because tight frames are relatively scarce, which makes them difficult to find. Indeed, it is known that (modulo rotations) there exists a unique tight frame of (d + 1) vectors in d-dimensional space [54]. Another reason that the alternating projection algorithm performs better as the problem grows is that a collection of N uniformly random unit-vectors converges almost surely to a tight frame as N tends to infinity [55]. It is therefore perplexing that the Ulukus-Yates algorithm performs more and more slowly. One might attribute this behavior to the fact that the algorithm does not act to equalize the singular values of the frame.

V. EQUIANGULAR TIGHT FRAMES

In this section, we shall consider a frame design problem that leads to a simple structural constraint on the Gram matrix. The goal of the alternating projection will be to design a suitable Gram matrix, from which the frame may be extracted afterward.

A tight frame is a generalization of an orthonormal basis because the two share the Parseval property. But orthonormal bases have other characteristics that one may wish to extend. In particular, every orthonormal basis is equiangular. That is, each pair of distinct vectors has the same inner product, namely zero. This observation suggests that one seek out equiangular tight frames. The underlying intuition is that these frames will contain vectors maximally separated in space.

Define an equiangular tight frame to be a unit-norm tight frame in which each pair of columns has the same absolute inner product. Since we are considering unit-norm tight frames, the absolute inner product between two frame vectors equals the cosine of the acute angle between the one-dimensional subspaces spanned by the two vectors. For this reason, the frames are called equiangular. One can show that each inner product in an equiangular tight frame has modulus

μ def= √( (N − d) / (d (N − 1)) ).   (7)

It is a remarkable fact that every ensemble of N unit vectors in d dimensions contains a pair whose absolute inner product strictly exceeds μ, unless the vectors form an equiangular tight frame. Unfortunately, equiangular tight frames exist only for rare combinations of d and N. In particular, a real, equiangular tight frame can exist only if N ≤ ½ d (d + 1), while a complex, equiangular tight frame requires that N ≤ d² [12]. The paper [56] contains detailed necessary conditions on real, equiangular tight frames and on equiangular tight frames over finite alphabets.
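Equation (7) is simple to evaluate; in this sketch the helper name is ours:

```python
import math

def etf_mu(d, N):
    # Equation (7): the common inner-product modulus of an N-vector
    # equiangular tight frame in d dimensions.
    return math.sqrt((N - d) / (d * (N - 1)))

print(etf_mu(2, 3))             # 0.5
print(round(etf_mu(3, 6), 4))   # 0.4472, the value seen in Section V-C
```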

One can view equiangular tight frames as a special type of Grassmannian frame. In finite dimensions, Grassmannian frames are unit-norm frames whose largest absolute inner product is minimal for a given d and N [12]. Their name is motivated by the fact that they correspond with sphere packings in the Grassmannian manifold of all one-dimensional subspaces of a vector space [29]. Grassmannian frames have applications in coding theory and communications engineering [9]–[12]. They also provide a natural set of vectors to use for sparse approximation [13]–[15].

In general, it is torturous to design Grassmannian frames. Not only is the optimization difficult, but there is no general procedure for deciding when a frame solves the optimization problem unless it meets a known lower bound. Most of the current research has approached the design problem with algebraic tools. A notable triumph of this type is the construction of Kerdock codes over Z2 and Z4 due to Calderbank et al. [57]. Other explicit constructions are discussed in the articles [10], [12]. In the numerical realm, Sloane has used his Gosset software to produce and study sphere packings in real Grassmannian spaces [58]. Sloane's algorithms have been extended to complex Grassmannian spaces in [59]. We are not aware of any other numerical methods.

In this article, we shall construct equiangular tight frames for real and complex vector spaces using alternating projection. The method can easily be extended to compute other finite Grassmannian frames and packings in higher Grassmannian manifolds, but that is another paper for another day [60].

A. Constraint Sets and Nearness Problems

The hallmark of an equiangular tight frame is that each inner product between distinct vectors has the same modulus. Since the Gram matrix of a tight frame displays all of the inner products, it is more natural to construct the Gram matrix of an equiangular tight frame than to construct the frame synthesis matrix directly. Therefore, the algorithm will alternate between the collection of Hermitian matrices that have the correct spectrum and the collection of Hermitian matrices that have sufficiently small off-diagonal entries.

Since we are working with unit-norm tight frames, the tightness parameter α must equal N/d. This leads to the spectral constraint set

Gα def= { G ∈ C^{N×N} : G = G* and G has eigenvalues (N/d, . . . , N/d, 0, . . . , 0), where N/d appears d times }.

Theorem 3 shows how to find a matrix in Gα nearest to an arbitrary Hermitian matrix.

In an equiangular tight frame, each vector has unit norm, but no two vectors have an absolute inner product larger than μ. Therefore, we define the structural constraint set

Hμ def= { H ∈ C^{N×N} : H = H*, diag H = 1, and max_{m≠n} |hmn| ≤ μ }.

It may seem more natural to require that the off-diagonal entries have modulus exactly equal to μ, but our experience indicates that the present formulation works better, perhaps because Hμ is convex. The following proposition shows how to produce the nearest matrix in Hμ.

Proposition 8: Let Z be an arbitrary matrix. With respect to the Frobenius norm, the unique matrix in Hμ closest to Z has a unit diagonal and off-diagonal entries that satisfy

hmn = zmn if |zmn| ≤ μ, and hmn = μ e^{i arg zmn} otherwise.

We use i to denote the imaginary unit.

Proof: The argument is straightforward.
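In code, the projection of Proposition 8 is an entrywise clip that preserves phases. The sketch below (function name ours) handles both the real and complex cases:

```python
import numpy as np

def nearest_in_H(Z, mu):
    # Clip each off-diagonal entry to modulus at most mu, keeping its
    # phase, then force a unit diagonal.
    mod = np.abs(Z)
    scale = np.where(mod > mu, mu / np.maximum(mod, 1e-300), 1.0)
    H = Z * scale
    np.fill_diagonal(H, 1.0)
    return H

Z = np.array([[2.0, 0.8j], [-0.8j, 2.0]])
print(nearest_in_H(Z, 0.5))   # unit diagonal; off-diagonal clipped to modulus 0.5
```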

B. Convergence Results

The general convergence result, Theorem 4, applies to the alternating projection between Gα and Hμ. We also obtain a local convergence result.

Theorem 9: Assume that the alternating projection between Gα and Hμ generates a sequence of iterates (Gj, Hj), and suppose that there is an iteration J during which ‖GJ − HJ‖F < N/(d √2). The sequence of iterates possesses at least one accumulation point, say (Ḡ, H̄).
• The accumulation point lies in Gα × Hμ.
• The pair (Ḡ, H̄) is a fixed point of the alternating projection. In other words, if we applied the algorithm to Ḡ or to H̄, every iterate would equal (Ḡ, H̄).

• The accumulation point satisfies

‖Ḡ − H̄‖F = lim_{j→∞} ‖Gj − Hj‖F.

• The component sequences are asymptotically regular, i.e.

‖Gj+1 − Gj‖F → 0 and ‖Hj+1 − Hj‖F → 0.

• Either the component sequences both converge in norm,

‖Gj − Ḡ‖F → 0 and ‖Hj − H̄‖F → 0,

or the set of accumulation points forms a continuum.

Proof: See Appendix II-B.

C. Numerical Examples

First, let us illustrate just how significant a difference there is between vanilla tight frames and equiangular tight frames. Here is the Gram matrix of a six-vector, unit-norm tight frame for R³:

[  1.0000   0.2414  −0.6303   0.5402  −0.3564  −0.3543
   0.2414   1.0000  −0.5575  −0.4578   0.5807  −0.2902
  −0.6303  −0.5575   1.0000   0.2947   0.3521  −0.2847
   0.5402  −0.4578   0.2947   1.0000  −0.2392  −0.5954
  −0.3564   0.5807   0.3521  −0.2392   1.0000  −0.5955
  −0.3543  −0.2902  −0.2847  −0.5954  −0.5955   1.0000 ].

Notice that the inner products between vectors are quite disparate, ranging in magnitude between 0.2392 and 0.6303. These inner products correspond to acute angles of 76.2° and 50.9°. In fact, this tight frame is pretty tame; the largest inner products in a unit-norm tight frame can be arbitrarily close to one.⁴ The Gram matrix of a six-vector, equiangular tight frame for R³ looks quite different:

[  1.0000   0.4472  −0.4472   0.4472  −0.4472  −0.4472
   0.4472   1.0000  −0.4472  −0.4472   0.4472  −0.4472
  −0.4472  −0.4472   1.0000   0.4472   0.4472  −0.4472
   0.4472  −0.4472   0.4472   1.0000  −0.4472  −0.4472
  −0.4472   0.4472   0.4472  −0.4472   1.0000  −0.4472
  −0.4472  −0.4472  −0.4472  −0.4472  −0.4472   1.0000 ].

Every pair of vectors meets at an acute angle of 63.4°. The vectors in this frame can be interpreted as the diagonals of an icosahedron [29].

We have used alternating projection to compute equiangular tight frames, both real and complex, in dimensions two through six. The algorithm performed poorly when initialized with random vectors, which led us to adopt a more sophisticated approach. We begin with many random vectors and winnow this collection down by repeatedly removing whatever vector has the largest inner product against another vector. It is fast and easy to design starting points in this manner, yet the results are impressive. These calculations are summarized in Table I.

TABLE I
EQUIANGULAR TIGHT FRAMES

 N \ d    2    3    4    5    6
   3      R    R    ..   ..   ..
   4      C    R    R    ..   ..
   5      ..   .    R    R    ..
   6      ..   R    .    R    R
   7      ..   C    C    .    R
   8      ..   .    C    .    .
   9      ..   C    .    .    C
  10      ..   ..   .    R    .
  11      ..   ..   .    C    C
  12      ..   ..   .    .    C
  13      ..   ..   C    .    .
  14      ..   ..   .    .    .
  15      ..   ..   .    .    .
  16      ..   ..   C    .    R
  17      ..   ..   ..   .    .
  18      ..   ..   ..   .    .
  19      ..   ..   ..   .    .
  20      ..   ..   ..   .    .
  21      ..   ..   ..   C    .
  22      ..   ..   ..   .    .
  23      ..   ..   ..   .    .
  24      ..   ..   ..   .    .
  25      ..   ..   ..   C    .
  26      ..   ..   ..   ..   .
  27      ..   ..   ..   ..   .
  28      ..   ..   ..   ..   .
  29      ..   ..   ..   ..   .
  30      ..   ..   ..   ..   .
  31      ..   ..   ..   ..   C
  32      ..   ..   ..   ..   .
  33      ..   ..   ..   ..   .
  34      ..   ..   ..   ..   .
  35      ..   ..   ..   ..   .
  36      ..   ..   ..   ..   C

The notations R and C respectively indicate that alternating projection was able to compute a real, or complex, equiangular tight frame. Note that every real, equiangular tight frame is automatically a complex, equiangular tight frame. One period (.) means that no real, equiangular tight frame exists, and two periods (..) mean that no equiangular tight frame exists at all.

Alternating projection can locate every real, equiangular tight frame in dimensions two through six; algebraic considerations eliminate all the remaining values of N [12], [56]. Moreover, the method computes these ensembles very efficiently. For example, the algorithm produced a six-vector, equiangular tight frame for R³ after a single trial. In this case, 70 iterations lasting 0.4573 seconds were sufficient to determine the first eight decimal places of the inner products.

In the complex case, the algorithm was able to compute every equiangular tight frame that we know of. Unfortunately, no one has yet developed necessary conditions on the existence of complex, equiangular tight frames aside from the upper bound N ≤ d², and so we have been unable to rule out the existence of other ensembles. Some of the computations progressed quite smoothly. After 1000 iterations and 18.75 seconds, alternating projection delivered a collection of 25 vectors in five dimensions whose inner products were identical in the first eight decimal places. On the other hand, it took 5000 iterations and 85.75 seconds to produce 21 vectors in five dimensions whose inner products reached the same level of accuracy. Even worse, we were unable to locate the 31-vector equiangular tight frame in C⁶ until we had performed two dozen random trials that lasted several minutes each. It is some consolation that the authors of [59] indicate their algorithm could not compute this ensemble at all.

It seems clear that some equiangular tight frames are much easier to find than others. We have encountered less success at constructing equiangular tight frames in higher dimensions. But we have neither performed extensive experiments nor have we attempted to fine-tune the method.

⁴To see this, consider a tight frame that contains two copies of an orthonormal basis, where one copy is rotated away from the other by an arbitrarily small angle.

VI. PEAK-TO-AVERAGE-POWER RATIO

Finally, let us present a situation in which the matrix nearness problem is much more difficult.

As we have mentioned, tight frames with prescribed vector norms coincide with signature sequences that maximize sum capacity in the uplink of direct-spread, synchronous code-division multiple-access (DS-CDMA) systems [6]–[8]. Unfortunately, general tight frames can have properties that are undesirable in practice. In particular, the individual frame vectors may have large peak-to-average-power ratio (PAR).

The PAR of an analog signal measures how the largest value of the signal compares with its average power. Signals with large PAR require higher dynamic range on the analog-to-digital converters and the digital-to-analog converters. They may also require more linear (and thus higher cost) power amplifiers. In DS-CDMA systems, the PAR is normally of concern only in the downlink (see e.g. [61]), where linear combinations of signatures can conspire to have tremendous peak power. On the uplink, the PAR problem is fundamentally different because it only involves individual signatures. Conventionally, the uplink PAR has not received attention because most systems use binary spreading sequences, which always have unit PAR. If general sum-capacity-optimal sequences are to be used in real systems, then PAR side constraints should be included in the design problem. Therefore, we shall consider how to construct tight frames whose columns have prescribed norms and low peak-to-average-power ratios.

As discussed in Section IV, many algorithms have already been developed for constructing tight frames with prescribed vector norms, such as [7], [19], [20], [22]. Unfortunately, these methods cannot accept additional constraints on the vectors, and thus they are not suitable for finding tight frames whose vectors have low PAR. We show that alternating projection provides a way to produce these ensembles. The PAR problem makes an interesting test case because it induces a matrix nearness problem that is considerably more challenging than those we have examined in previous sections.

A. Constraint Sets and Matrix Nearness Problems

The PAR in a digital communication system is fundamentally related to the analog waveforms that are generated. From the perspective of sequence design, it usually suffices to consider the PAR defined directly from the discrete sequence. The discrete PAR of a vector z is the quantity

PAR(z) def= max_m |zm|² / (∑_m |zm|²/d).

Note that 1 ≤ PAR(z) ≤ d. The lower extreme corresponds to a vector whose entries have identical modulus, while the upper bound is attained only by (scaled) canonical basis vectors.
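The discrete PAR and its two extremes can be checked directly (function name ours):

```python
import numpy as np

def par(z):
    # Discrete peak-to-average-power ratio of a length-d vector.
    power = np.abs(np.asarray(z)) ** 2
    return power.max() / power.mean()

print(par([1, -1, 1j, -1j]))   # 1.0: flat-modulus vectors attain the minimum
print(par([2, 0, 0, 0]))       # 4.0: a scaled canonical basis vector, d = 4
```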

Suppose that we require the columns of the frame to have squared norms c1, . . . , cN. In the DS-CDMA application, these numbers depend on the users' power constraints [6], [7]. It follows from (3) that α = ∑_n cn/d. The spectral constraint set becomes

Xα def= { X ∈ C^{d×N} : XX* = (∑_n cn/d) Id }.

Theorem 2 delivers the solution to the associated matrix nearness problem.

Let ρ denote the upper bound on the PAR of the frame elements. Then the structural constraint set will be

S def= { S ∈ C^{d×N} : PAR(sn) ≤ ρ and ‖sn‖²₂ = cn }.

Given an arbitrary matrix Z, we must compute the nearest element of S. Since the structural constraint on each column is independent and the Frobenius norm is separable, each column yields an independent optimization problem. For each column zn of the input matrix, we claim that the following algorithm returns sn, the corresponding column of a nearest matrix S from S.

Algorithm 2 (Nearest Vector with Low PAR):

INPUT:
• An input vector z from C^d
• A positive number c, the squared norm of the solution vector
• A number ρ from [1, d], which equals the maximum permissible PAR

OUTPUT:
• A vector s from C^d that solves

min_s ‖s − z‖₂ subject to PAR(s) ≤ ρ and ‖s‖²₂ = c.

PROCEDURE:

1) Scale z to have unit norm; define δ = √(c ρ/d); and initialize k = 0.
2) Let M index (d − k) components of z with least magnitude. If this set is not uniquely determined, increment k and repeat Step 2.
3) If zm = 0 for each m in M, a solution vector is

s_m = √((c − k δ²)/(d − k)) for m ∈ M, and
s_m = δ e^{i arg zm} for m ∉ M.

4) Otherwise, let

γ = √((c − k δ²) / ∑_{m∈M} |zm|²).

5) If γ |zm| > δ for any m in M, increment k and return to Step 2.
6) The unique solution vector is

s_m = γ zm for m ∈ M, and
s_m = δ e^{i arg zm} for m ∉ M.

When ρ = 1, the output of the algorithm is a unimodular vector whose entries have the same phase as the corresponding entries of z. On the other hand, when ρ = d, the output vector equals z, rescaled to have squared norm c. We now prove that the algorithm is correct.
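Before turning to the proof, here is a direct transcription of Algorithm 2 into numpy. It is a sketch, not the authors' code: ties in Step 2 are left to the sort order, and a small tolerance guards the strict comparison in Step 5.

```python
import numpy as np

def low_par_vector(z, c, rho):
    z = np.asarray(z, dtype=complex)
    d = z.size
    z = z / np.linalg.norm(z)                  # Step 1
    delta = np.sqrt(c * rho / d)
    order = np.argsort(np.abs(z))              # ascending magnitude
    for k in range(d):
        M, big = order[:d - k], order[d - k:]  # Step 2
        energy = np.sum(np.abs(z[M]) ** 2)
        if energy == 0:                        # Step 3
            fill = np.sqrt((c - k * delta ** 2) / (d - k))
        else:
            gamma = np.sqrt((c - k * delta ** 2) / energy)        # Step 4
            if np.any(gamma * np.abs(z[M]) > delta * (1 + 1e-12)):
                continue                       # Step 5: truncate one more
            fill = gamma * z[M]
        s = np.empty(d, dtype=complex)         # Step 6
        s[M] = fill
        s[big] = delta * np.exp(1j * np.angle(z[big]))
        return s

s = low_par_vector([4.0, 1.0, 1.0, 1.0], c=1.0, rho=2.0)
print(np.round(np.abs(s), 4))    # the largest entry is clipped at delta
```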

Proof: We must solve the optimization problem

min_s ‖s − z‖²₂ subject to PAR(s) ≤ ρ and ‖s‖²₂ = c.

Let us begin with some major simplifications. First, rewrite the PAR constraint by enforcing the norm requirement and rearranging to obtain the equivalent condition

max_m |sm| ≤ √(c ρ/d).

In the rest of the argument, the symbol δ will abbreviate the quantity √(c ρ/d). The PAR constraint becomes |sm| ≤ δ for each m = 1, . . . , d.

Now expand the objective function and enforce the norm constraint again to obtain

min_s [ c − 2 Re ⟨s, z⟩ + ‖z‖²₂ ].

Observe that it is necessary and sufficient to minimize the second term. It follows that the optimizer does not depend on the scale of the input vector z. So take ‖z‖₂ = 1 without loss of generality.

Next observe that the PAR constraint and the norm constraint do not depend on the phases of the components in s. Therefore, the components of an optimal s must have the same phases as the components of the input vector z. In consequence, we may assume that both s and z are non-negative real vectors.

We have reached a much more straightforward optimization problem. Given a non-negative vector z with unit norm, we must solve

max_s ⟨s, z⟩ subject to ⟨s, s⟩ = c and 0 ≤ sm ≤ δ.

Observe that every point of the feasible set is a regular point, i.e., the gradients of the constraints are linearly independent. Therefore, Karush–Kuhn–Tucker (KKT) theory will furnish necessary conditions on an optimizer [62].

We form the Lagrangian function

L(s, λ,µ,ν) = −〈s,z〉+ 12 λ (〈s, s〉 − c)

− 〈s,µ〉+ 〈s− δ 1,ν〉 .

The Lagrange multipliers µ and ν are non-negative because they correspond to the lower and upper bounds on s. Meanwhile, the multiplier λ is unrestricted because it is associated with the equality constraint.

The first-order KKT necessary condition on a regular local maximum s⋆ is that

0 = (∇_s L)(s⋆, λ⋆, µ⋆, ν⋆) = −z + λ⋆ s⋆ − µ⋆ + ν⋆,    (8)

where µ⋆_m > 0 only if s⋆_m = 0 and ν⋆_m > 0 only if s⋆_m = δ. Notice that one of µ⋆_m or ν⋆_m must be zero because they correspond to mutually exclusive constraints. The second-order KKT necessary condition on a regular local maximum is that

0 ≤ yᵀ (∇²_s L)(s⋆, λ⋆, µ⋆, ν⋆) y = λ⋆ yᵀ y

for every vector y in the subspace of first-order feasible variations. This subspace is non-trivial, so λ⋆ ≥ 0.

Solve Equation (8) to obtain

λ⋆ s⋆ = z + µ⋆ − ν⋆.


Whenever µ⋆_m > 0, both s⋆_m = 0 and ν⋆_m = 0. This combination is impossible because it would force z_m = −µ⋆_m < 0, contradicting z_m ≥ 0. Therefore, we may eliminate µ⋆ to reach

λ⋆ s⋆ = z − ν⋆.

The cases λ⋆ = 0 and λ⋆ > 0 require separate consideration. If λ⋆ = 0, it is clear that ν⋆ = z. Since ν⋆_m > 0 only if s⋆_m = δ, we must have s⋆_m = δ whenever z_m > 0. Suppose that k components of s⋆ equal δ. The remaining (d − k) components are not uniquely determined by the optimization problem. From the many solutions, we choose one such that

s⋆_m = √((c − k δ²)/(d − k)) for m where z_m = 0.

This formula ensures that s⋆ has the correct norm and that none of its entries exceeds δ.

When λ⋆ > 0, the solution has the form

s⋆ = [γ z]_δ,

where γ is positive and the operator [·]_δ truncates to δ the components of its argument that exceed δ. It is clear that the largest components of z are all truncated at the same time. We only need to determine which components these are.

To that end, observe that γ ↦ ‖[γ z]_δ‖₂ is a strictly increasing function on [0, δ/z_min], where z_min is the least positive component of z. For at most one value of γ, therefore, does the vector [γ z]_δ have norm √c. If this norm value were not attained, then λ⋆ would equal zero. Let k be the number of entries of s⋆ that equal δ, and suppose that M indexes the remaining (d − k) components. Then

c = ‖s⋆‖₂² = k δ² + γ² Σ_{m∈M} |z_m|².

Recall that γ is positive. Therefore, it is impossible that k δ² > c. When k δ² = c, it follows that z_m = 0 for each m in M. Otherwise, z_m must be non-zero for some m in M, and the value of γ must be

γ = √((c − k δ²) / Σ_{m∈M} |z_m|²).

B. Convergence

For the alternating projection between the PAR constraint set and the set of α-tight frames, we have not proven a more elaborate convergence theorem than the basic result, Theorem 4, because it is not easy to guarantee that the solution to the PAR matrix nearness problem is unique. We have been able to provide a sufficient condition on the fixed points of the iteration that lie in the PAR constraint set S. Note that similar fixed points arose in Section IV.

Theorem 10: A sufficient condition for a full-rank matrix S from S to be a fixed point of the alternating projection between S and Xα is that the columns of S are all eigenvectors of SS*. That is, SS*S = SΛ, where Λ ∈ C^{N×N} is diagonal and positive, with no more than d distinct entries.

Proof: Refer to Appendix II-E.
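The fixed-point condition is easy to check numerically on a toy example. The sketch below uses our own 2 × 3 real matrix, not a frame from the paper's experiments, to verify that a full-rank S whose columns are all eigenvectors of SSᵀ satisfies SSᵀS = SΛ with Λ diagonal:

```python
def matmul(A, B):
    """Naive matrix product for matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Columns of S are e1, e1, e2, so every column is an eigenvector of S S^T.
S = [[1.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

SSt = matmul(S, transpose(S))        # diag(2, 1)
lhs = matmul(SSt, S)                 # S S^T S
Lam = [[2.0, 0.0, 0.0],              # eigenvalue attached to each column
       [0.0, 2.0, 0.0],
       [0.0, 0.0, 1.0]]
rhs = matmul(S, Lam)                 # S Lambda
```

Here Λ = diag(2, 2, 1) has two distinct entries, consistent with the bound of at most d = 2 distinct values.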

C. Numerical Examples

Let us demonstrate that alternating projection can indeed produce tight frames whose columns have specified PAR and specified norm. We shall produce complex tight frames because, in the real case, PAR constraints can lead to a discrete optimization problem. The experiments all begin with the initial 3 × 6 matrix

[  .0748+.3609i   .0392+.4558i   .5648+.3635i  −.2567+.4463i   .7064+.6193i   .1586+.6825i
   .5861−.0570i  −.2029+.8024i  −.5240+.4759i  −.1806−.1015i  −.1946−.1889i   .5080+.0226i
  −.7112+.1076i  −.2622−.1921i  −.1662+.1416i   .0202+.8316i   .0393−.2060i   .2819+.4135i ].

The respective PAR values of its columns are 1.5521, 2.0551,1.5034, 2.0760, 2.6475 and 1.4730.

Unit-PAR tight frames are probably the most interesting example. In each column of a unit-PAR tight frame, the entries share an identical modulus, which depends on the norm of the column. Let us apply our algorithm to calculate a unit-PAR, unit-norm tight frame:

[  .1345+.5615i   .1672+.5526i   .4439+.3692i  −.3358+.4696i   .4737+.3300i   .0944+.5696i
   .5410−.2017i  −.0303+.5766i  −.5115+.2679i  −.5432−.1956i  −.3689−.4442i   .5747+.0554i
  −.5768+.0252i  −.2777−.5062i  −.2303+.5294i   .1258+.5635i  −.0088−.5773i   .4132+.4033i ].

Indeed, each of the columns has unit PAR and unit norm. The singular values of the matrix are identical to eight decimal places. The calculation required 78 iterations lasting 0.1902 seconds.
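These claims can be verified directly from the printed digits. The Python check below (the helper names and tolerances are ours; the tolerances reflect the four-decimal rounding of the displayed entries) recomputes the column norms, the PAR of each column, and the product XX*, which must equal (N/d) I₃ = 2 I₃ for a tight frame of six unit-norm vectors in C³:

```python
# Rows of the computed unit-PAR, unit-norm tight frame (four-decimal entries).
X = [[ .1345 + .5615j,  .1672 + .5526j,  .4439 + .3692j,
      -.3358 + .4696j,  .4737 + .3300j,  .0944 + .5696j],
     [ .5410 - .2017j, -.0303 + .5766j, -.5115 + .2679j,
      -.5432 - .1956j, -.3689 - .4442j,  .5747 + .0554j],
     [-.5768 + .0252j, -.2777 - .5062j, -.2303 + .5294j,
       .1258 + .5635j, -.0088 - .5773j,  .4132 + .4033j]]
d, N = 3, 6

cols = [[X[i][j] for i in range(d)] for j in range(N)]
norms = [sum(abs(x) ** 2 for x in c) ** 0.5 for c in cols]
# PAR of a column: d * max |s_m|^2 / ||s||^2
pars = [d * max(abs(x) ** 2 for x in c) / sum(abs(x) ** 2 for x in c)
        for c in cols]
# X X* for an alpha-tight frame is alpha I; here alpha = N/d = 2.
gram = [[sum(X[i][j] * X[k][j].conjugate() for j in range(N))
         for k in range(d)] for i in range(d)]
```

Within the rounding of the printed entries, every column norm and PAR is 1 and XX* is 2 I₃.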

Alternating projection can also compute tight frames whose columns have unit PAR but different norms. For example, if we request the column norms 0.75, 0.75, 1, 1, 1.25 and 1.25, the algorithm yields

[  .3054+.3070i   .1445+.4082i   .3583+.4527i  −.4264+.3893i   .4252+.5831i   .3622+.6242i
   .4295−.0549i   .1235+.4150i  −.5597+.1418i  −.5393−.2060i  −.4425−.5701i   .7165−.0863i
  −.4228−.0936i  −.0484−.4303i   .0200+.5770i   .2585+.5162i  −.2894−.6611i   .1291+.7101i ].

One can check that the column norms, PAR and singular values all satisfy the design requirements to eight or more decimal places. The computation took 84 iterations over 0.1973 seconds.

Less stringent constraints on the PAR pose even less trouble. For example, we might like to construct a tight frame whose PAR is bounded by two and whose columns have norms 0.75, 0.75, 1, 1, 1.25 and 1.25. The algorithm produces

[  .0617+.1320i   .0184+.2764i   .4299+.3593i  −.1382+.2511i   .6847+.7436i   .2933+.6939i
   .4256−.1031i  −.0558+.5938i  −.5920+.4974i  −.4306−.2650i  −.2095−.3072i   .7317+.0928i
  −.5912+.0025i  −.1304−.3363i  −.0807+.2857i   .0852+.8093i  −.3504−.5289i   .2918+.6048i ].


The computer worked for 0.0886 seconds, during which it performed 49 iterations. As usual, the singular values match to eight decimal places. It is interesting to observe that the frame exceeds the design specifications. The respective PAR values of its columns are 1.8640, 1.8971, 1.7939, 1.9867, 1.9618 and 1.0897.

VII. D ISCUSSION

As advertised, we have developed an alternating projection method for solving frame design problems, and we have provided ample evidence that it succeeds. In this section, we discuss some implementation issues and some of the limitations of the algorithm. We conclude with a collection of related problems that one can also solve with alternating projection.

A. The Starting Point

For alternating projection to succeed, it is essential to choose a good starting point. Here are a few general strategies that may be useful.

The simplest method is to select N vectors uniformly at random from the surface of the unit sphere in C^d and form them into an initial matrix. Although this technique sometimes works, it is highly probable that there will be pairs of strongly correlated vectors, and it is usually preferable for the frame to contain dissimilar vectors. Nevertheless, a collection of random vectors converges almost surely to a tight frame as more vectors are added [55].

A more practical idea is to select many vectors, say 2dN, and then use a clustering algorithm—such as Lloyd–Max [63], spherical k-means [64] or diametrical clustering [65]—to separate these vectors into N clusters. The cluster representatives will usually be much more diverse than vectors chosen at random. A related approach would select many random vectors and then greedily remove vectors that are highly correlated with the remaining vectors. This method seems to furnish excellent starting points for constructing equiangular tight frames. One might also build up a collection of random vectors by allowing a new vector to join only if it is weakly correlated with the current members.
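The greedy admission strategy in the last sentence can be sketched as follows. The correlation threshold `mu` and the trial budget are hypothetical parameters of our own choosing, not values from the paper:

```python
import math
import random

def random_unit_vector(d, rng):
    """Complex Gaussian vector scaled onto the unit sphere of C^d."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
    nrm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / nrm for x in v]

def diverse_start(d, N, mu=0.9, trials=2000, seed=0):
    """Admit a random unit vector only if its inner product with every
    vector already kept has magnitude at most mu (mu is illustrative)."""
    rng = random.Random(seed)
    kept = []
    for _ in range(trials):
        v = random_unit_vector(d, rng)
        ok = all(abs(sum(a * b.conjugate() for a, b in zip(v, w))) <= mu
                 for w in kept)
        if ok:
            kept.append(v)
        if len(kept) == N:
            break
    return kept
```

Forming the kept vectors into the columns of a d × N matrix gives a starting point whose columns are pairwise weakly correlated.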

Another technique is to start with a tight frame that has been developed for another application. By rotating the frame at random, it is possible to obtain many different starting points that retain some of the qualities of the original frame. In particular, equiangular tight frames make excellent initializers.

It is also possible to choose a collection of N vectors from a larger frame for C^d. Similarly, one might truncate some coordinates from a frame in a higher-dimensional space. In particular, one might truncate an orthonormal basis for C^N to retain only d coordinates. See [66], for example, which uses the Fourier transform matrix in this manner.

B. Limitations

Alternating projection is not a panacea that can alleviate all the pain of frame design. While preparing this report, we encountered several difficulties.

A theoretical irritation is the lack of a proof that alternating projection converges in norm. No general proof is possible, as the counterexample in [67] makes clear. Nevertheless, it would be comforting to develop sufficient conditions that guarantee the convergence of alternating projections between non-convex sets. The results of [67] are the best that we know of. We would also like to develop conditions that can ensure convergence to a pair of points at minimal distance. Here, the most general results are probably due to Csiszar and Tusnady [68].

Another major inconvenience is that alternating projection converges at a geometric rate (or worse) [49]–[52]. For large problems, it can be painful to wait for the solution. A valuable topic for future research would be a method of acceleration.

A more specific disappointment was the inability of alternating projection to construct tight frames over small finite alphabets. It is straightforward to solve the matrix nearness problem associated with a finite alphabet, and it can be shown that the algorithm always converges in norm to a fixed point. But the algorithm never once yielded a tight frame. This failure is hardly surprising; discrete constraints are some of the most difficult to deal with in optimization. It may be possible to use annealing to improve the performance of the algorithm. This would be a valuable topic for future research.

C. Related Problems

We have permitted a great deal of freedom in the selection of the structural constraint set, but we only considered the spectral constraints that arise naturally in connection with tight frames. Nevertheless, alternating projection offers a straightforward method for addressing other inverse eigenvalue problems. For example, one might try to construct general frames with prescribed lower and upper frame bounds, α and β. Instead of forcing the Gram matrix to be a rank-d orthogonal projector, one might impose only a rank constraint or a constraint on its condition number. To implement the algorithm, it would only be necessary to solve the matrix nearness problem associated with these spectral constraints.

One can also use alternating projection to construct positive semi-definite (PSD) matrices that have certain structural properties. Higham, for example, has used a corrected alternating projection to produce the correlation matrix nearest to an input matrix [37]. (A correlation matrix is a PSD matrix with a unit diagonal.) Since the PSD matrices form a closed, convex set, it is possible to prove much more about the behavior of alternating algorithms.

We have also had good success using alternating projection to compute sphere packings in real and complex projective spaces. These methods can be extended to produce sphere packings in real and complex Grassmannian manifolds [60]. It seems clear that alternating projection has a promising future for a new generation of problems.

ACKNOWLEDGMENTS

I. S. Dhillon acknowledges support by NSF CAREER Award No. ACI-0093404. R. W. Heath acknowledges support from the Texas Advanced Technology Program under Grant


No. 003658-0614-2001, the Samsung Advanced Institute of Technology, and the National Instruments Foundation. T. Strohmer acknowledges partial support by NSF grant DMS 0208568.

APPENDIX I
POINT-TO-SET MAPS

To understand the convergence of the algorithms, we rely on some basic results from the theory of point-to-set maps. Zangwill's book [69] is a good basic reference with applications to mathematical programming. More advanced surveys include [70], [71]. de Leeuw presents statistical applications in [51]. We have drawn from all these sources here.

A. Point-to-Set Maps

Let Y and Z be arbitrary sets. The power set of Z is the collection of all subsets of Z, and it is denoted by 2^Z. A point-to-set map from Y to Z is a function Ω : Y → 2^Z. In words, Ω maps each point of Y to a subset of Z.

There are several different ways of combining point-to-set maps. Take two maps Ω_yz : Y → 2^Z and Ω_zw : Z → 2^W. The composition of these maps carries a point y to a subset of W via the rule

(Ω_zw ∘ Ω_yz)(y) = ⋃_{z ∈ Ω_yz(y)} Ω_zw(z).

This definition can be extended in the obvious way to a longer composition of maps. Now, suppose Ω_uv maps U to 2^V. The Cartesian product of Ω_uv and Ω_yz is the point-to-set map from U × Y to V × Z given by

(Ω_uv × Ω_yz)(u, y) = Ω_uv(u) × Ω_yz(y).

B. Topological Properties

Suppose that the underlying sets are endowed with topologies so that we may speak of convergence. A map Ω : Y → 2^Z is closed at the point y in Y whenever the statements y_j → y, z_j ∈ Ω(y_j) and z_j → z together imply that z ∈ Ω(y). One may interpret this definition as saying that the set Ω(y) is "bigger" than the sets in the sequence Ω(y_j). On the other hand, the map Ω is open at y in Y whenever the statements y_j → y and z ∈ Ω(y) together imply the existence of a number J and a sequence of points {z_j} such that z_j → z and z_j ∈ Ω(y_j) for all j ≥ J. More or less, this statement means that the set Ω(y) is "smaller" than the sets in the sequence Ω(y_j). A map which is both open and closed at y is said to be continuous at y. We call Ω an open map, closed map or continuous map whenever it has the corresponding property for every point in Y.

Finite Cartesian products and finite compositions of open maps are open. Finite Cartesian products of closed maps are closed. If Ω_yz : Y → 2^Z and Ω_zw : Z → 2^W are closed and Z is compact, then the composition (Ω_zw ∘ Ω_yz) is closed.

C. Fixed Points

Suppose that Ω is a point-to-set map from Y to itself. Let y be a point of Y for which Ω(y) = {y}. Then y is called a fixed point of the map Ω. In contrast, a generalized fixed point of Ω is a point for which y ∈ Ω(y). When we wish to emphasize the distinction, we may refer to a regular fixed point as a strong or classical fixed point.

D. Infimal Maps

Minimizing functions leads to a special type of point-to-set map. Suppose that f : Y × Z → R₊ is a real-valued function of two variables, and let Ω be a point-to-set map from Y to Z. Associated with f and Ω is an infimal map defined by

M_z(y) := arg min_{z ∈ Ω(y)} f(y, z).

If f(y, ·) attains no minimal value on Ω(y), then M_z(y) = ∅, the empty set. Under mild conditions, infimal maps are closed.

Theorem 11 (Dantzig–Folkman–Shapiro [72]): If Ω is continuous at y and f(y, ·) is continuous on Ω(y), then M_z is closed at y.

In particular, the constant map Ω : y ↦ Z is continuous whenever Z is closed. So minimizing a continuous function over a fixed, closed set always yields a closed infimal map.

E. Iterative Algorithms

Zangwill was apparently the first to recognize that many procedures in mathematical programming find their most natural expression in the language of point-to-set maps [69]. An algorithmic map or algorithm is simply a function Ω : Y → 2^Y. Given an initial point y₀ of Y, an algorithmic map generates a sequence of iterates according to the rule

y_{j+1} ∈ Ω(y_j).

Suppose that f : Y → R₊ is a continuous, non-negative function. We say that the algorithm Ω is monotonic with respect to f when

z ∈ Ω(y) implies f(z) ≤ f(y).

An algorithm strictly monotonic with respect to f is a monotonic algorithm for which

z ∈ Ω(y) and f(z) = f(y) imply z = y.

Zangwill showed that a closed, monotonic algorithm converges in a weak sense to a generalized fixed point. We present a streamlined version of his result.

Theorem 12 (Zangwill [69]): Let Ω be a closed algorithmic map on a compact set Y, and assume that Ω is monotonic with respect to a continuous, non-negative function f. Suppose that the algorithm generates a sequence of iterates {y_j}.
• The sequence has at least one accumulation point in Y.
• Each accumulation point y satisfies f(y) = lim_j f(y_j).
• Each accumulation point y is a generalized fixed point of the algorithm.

R. R. Meyer subsequently extended Zangwill's Theorem

to provide a more satisfactory convergence result for strictly


monotonic algorithms. One version of his result follows. For reference, a sequence {y_j} in a normed space is called asymptotically regular when ‖y_{j+1} − y_j‖ → 0.

Theorem 13 (Meyer [67]): Let Y be a compact subset of a normed space, and assume that Ω is a closed algorithm on Y that is strictly monotonic with respect to the continuous, non-negative function f. Suppose that Ω generates a sequence of iterates {y_j}. In addition to the conclusions of Zangwill's Theorem, the following statements hold.

• Each accumulation point of the sequence is a (strong) fixed point of the algorithm.
• The sequence of iterates is asymptotically regular. In consequence, it has a continuum of accumulation points, or it converges in norm [73].
• In case that the fixed points of Ω on each isocontour of f form a discrete set, the sequence of iterates converges in norm.

F. Alternating Projection

An alternating projection can be interpreted as a kind of monotonic algorithm. Suppose that f : Y × Z → R₊ is a continuous function. Then f induces two natural infimal maps,

M_y(z) := arg min_{y ∈ Y} f(y, z) and
M_z(y) := arg min_{z ∈ Z} f(y, z).

If Y and Z are closed, then Theorem 11 shows that the maps M_y and M_z are both closed.

We interpret alternating projection as an algorithm on the product space Y × Z equipped with the usual product topology. Given an initial iterate y₀ from Y, alternating projection generates a sequence of iterates (y_j, z_j) via the rules

z_j ∈ M_z(y_j) and y_{j+1} ∈ M_y(z_j)

for each j ≥ 0. Formally, this algorithm can be written as the composition of two sub-algorithms, Ω_to and Ω_fro, that are defined as

Ω_to : (y, z) ↦ {y} × M_z(y) and
Ω_fro : (y, z) ↦ M_y(z) × {z}.

It follows that Ω := Ω_fro ∘ Ω_to is a closed algorithm whenever Y and Z are compact. Both sub-algorithms decrease the value of f, so it should also be clear that Ω is monotonic with respect to f. Zangwill's Theorem tenders a basic convergence result.

Corollary 14: Let Y and Z be compact. Suppose that the alternating projection between Y and Z generates a sequence of iterates (y_j, z_j).
• The sequence has at least one accumulation point.
• Each accumulation point of the sequence lies in Y × Z.
• Each accumulation point is a generalized fixed point of the algorithm.
• Each accumulation point (y, z) satisfies f(y, z) = lim_j f(y_j, z_j).

If the infimal maps M_y and M_z are single-valued, we can achieve a much more satisfactory result.

Corollary 15: Let Y and Z be compact subsets of a normed space, and assume that the infimal maps M_y and M_z are single-valued. Suppose that the alternating projection between Y and Z generates a sequence of iterates (y_j, z_j). In addition to the conclusions of Corollary 14, we have the following.
• Each accumulation point is a classical fixed point of the alternating projection.
• The sequence of iterates is asymptotically regular.
• The sequence of iterates either converges in norm or it has a continuum of accumulation points.

Proof: We just need to show that the algorithm is

strictly monotonic with respect to f. Suppose that f(y, z) = f(Ω(y, z)). Since the infimal maps never increase the value of f, we have the equalities

f(y, z) = f(y, M_z(y)) = f((M_y ∘ M_z)(y), M_z(y)) = f(Ω(y, z)).

Since M_z yields the unique minimizer of f with its first argument fixed, the first equality implies that M_z(y) = z. Likewise, the second equality yields (M_y ∘ M_z)(y) = y. That is, Ω(y, z) = (y, z). An application of Meyer's Theorem completes the argument.

This result is a special case of a theorem of Fiorot and Huard [74]. In Appendix II, we shall translate the language of these corollaries into more familiar terms.

G. Literature on Alternating Projection

Like most good ideas, alternating projection has a long biography and several aliases, including successive approximation, successive projection, alternating minimization and projection on convex sets. This section offers a résumé of the research on alternating projection, but it makes no pretension to be comprehensive. Deutsch has written more detailed surveys, including [52], [75], [76].

According to Deutsch [75], alternating projection first appeared in a set of mimeographed lecture notes, written by John von Neumann in 1933. von Neumann proved that the alternating projection between two closed subspaces of a Hilbert space converges pointwise to the orthogonal projector onto their intersection [39]. Apparently, this theorem was not very well advertised, because many other authors have discovered it independently, including Aronszajn [49] and Wiener [77]. It was shown by Aronszajn [49] and Kayalar–Weinert [50] that both sequences of iterates converge geometrically with a rate exactly equal to the squared cosine of the (Friedrichs) principal angle between the two subspaces.

It is natural to extend the alternating projection between two subspaces by cyclically projecting onto several subspaces. Halperin demonstrated that, in a Hilbert space, the cyclic projection among a finite number of closed subspaces converges pointwise to the orthogonal projector onto their intersection [78]. The convergence is geometric [79]. Optimal bounds on the rate of convergence can be computed with techniques of Xu and Zikatanov [80]. Bauschke et al. study methods for accelerating cyclic projection in the recently minted paper [81].


It will come as no surprise that researchers have also studied alternating projection between subspaces of a Banach space. Unaware of von Neumann's work, Diliberto and Straus introduced an alternating method for computing the best sup-norm approximation of a bivariate continuous function as the sum of two univariate continuous functions, and they proved some weak convergence results [82]. The norm convergence of the sequence of iterates remained open until the work of Aumann [83]. M. Golomb extended the Diliberto–Straus algorithm to other best-approximation problems [84]. For more information on alternating algorithms in Banach spaces, see the monograph of Cheney and Light [85].

Another fruitful generalization is to consider projection onto convex subsets. The projector—or proximity map—onto a closed, convex subset of a Hilbert space is well-defined, because each point has a unique best approximation from that set. The basic result, due to Cheney and Goldstein, is that the alternating projection between two closed, convex subsets of a Hilbert space will converge to a pair of points at minimal distance from each other, so long as one set is compact [4]. Dykstra [86], [87] and Han [88] independently developed a cyclic projection technique that, given a point, can compute its best approximation from the intersection of a finite number of closed, convex sets in a Hilbert space. Their algorithm requires a correction to each projection. To date, the most extensive treatment of cyclic projection methods is the survey article by Bauschke and Borwein [89].

Most of the work on alternating projection has involved the Euclidean distance, but it is possible to develop results for other divergence measures. In particular, Csiszar and Tusnady have shown that alternate minimization of the Kullback–Leibler divergence can be used to find a pair of minimally distant points contained within two convex sets of probability measures [68].

There has been some research on alternating projection between non-convex sets, but the theoretical results so far are limited. Fiorot and Huard have applied the theorems of Zangwill and Meyer to obtain weak convergence results for a class of block relaxation schemes that include alternating and cyclic projection onto non-convex sets [74]. Combettes and Trussell have developed a technique which inflates the non-convex sets into convex sets; they offer some qualified convergence results [90]. Cadzow has also demonstrated empirically that cyclic projections among non-convex sets can effectively solve some signal enhancement problems [91]. More research in this direction would be valuable.

Alternating projection has found application to many different problems, of which we offer a (small) selection. The most famous example from these pages must be the Blahut–Arimoto algorithm for computing channel capacity and rate-distortion functions [92], [93]. In the field of signal restoration and recovery, we mention the work of Landau–Miranker [94], Gerchberg [95], Youla–Webb [96], Cadzow [91] and Donoho–Stark [97]. Cetin, Gerek and Yardimci show that projection on convex sets can compute multi-dimensional equiripple filters [98]. Xu and Zikatanov discuss how alternating projection can be used to solve the linear systems that arise in the discretization of partial differential equations [80]. In the matrix analysis

community, alternating projection has been used as a computational method for solving inverse eigenvalue problems [36], [38] and for solving matrix nearness problems [37], [99]. In statistics, one may view the Expectation Maximization (EM) algorithm as an alternating projection [100]. de Leeuw has discussed other statistical applications in [51].

APPENDIX II
CONVERGENCE AND FIXED POINTS

Armed with the theory of the last appendix, we are finally girded to attack the convergence of Algorithm 1. The results on point-to-set maps will allow us to dispatch this dragon quickly. Then we shall turn our attention to the convergence of the algorithm in the special case that the frame vectors have prescribed norms. This problem will require a longer siege, but it, too, will yield to our onslaught. The convergence results that we develop here are all novel.

A. Basic Convergence Proof

In this section, we establish the convergence of the basic alternating projection algorithm that appears in Section III-D. Our main burden is to translate the language of point-to-set maps into more familiar terms.

Theorem 16 (Global Convergence): Let Y and Z be closed sets, one of which is bounded. Suppose that alternating projection generates a sequence of iterates (Y_j, Z_j). This sequence possesses at least one accumulation point, say (Ȳ, Z̄).
• The accumulation point lies in Y × Z.
• The accumulation point satisfies

‖Ȳ − Z̄‖_F = lim_{j→∞} ‖Y_j − Z_j‖_F.

• The accumulation point satisfies

‖Ȳ − Z̄‖_F = dist(Ȳ, Z) = dist(Z̄, Y).

Proof: Assume without loss of generality that Y is the compact set, while Z is merely closed. We must establish that we have all the compactness necessary to apply Corollary 14.

Without loss of generality, assume that Y₀ ∈ Y. If δ = ‖Y₀ − Z₀‖_F, then subsequent iterates always satisfy

‖Y_j − Z_j‖_F ≤ δ and ‖Y_{j+1} − Z_j‖_F ≤ δ.

Thus, we may restrict our attention to the sets

Y₁ = {Y ∈ Y : dist(Y, Z) ≤ δ} and
Z₁ = {Z ∈ Z : dist(Z, Y) ≤ δ}.

Since Y is compact, Y₁ is compact because it is a closed subset of a compact set. On the other hand, Z₁ is compact because it is the intersection of the closed set Z with a compact set, namely the collection of matrices within a fixed distance of Y.

We may apply Corollary 14. Each of the conclusions of the corollary has a straightforward analogue among the conclusions of the present theorem. The only question that may remain is what it means for a pair of matrices (Ȳ, Z̄) to be a generalized fixed point of the alternating projection.


A generalized fixed point of an algorithm is a point which is a possible successor of itself. In the present case, a pair of matrices can succeed itself if and only if the second component is a potential successor of the first and the first component is a potential successor of the second. The matrix Z̄ can succeed the matrix Ȳ if and only if

‖Z̄ − Ȳ‖_F = dist(Ȳ, Z).

Likewise, Ȳ can succeed Z̄ if and only if

‖Ȳ − Z̄‖_F = dist(Z̄, Y).

This observation completes the proof.

Since the collection of α-tight frames and the collection of their Gram matrices are both compact, the theorem has two immediate corollaries.

Corollary 17: If Xα is the collection of α-tight frames, and S is a closed set of matrices, then Theorem 16 applies with Y := S and Z := Xα.

Corollary 18: If Gα contains the Gram matrices of all α-tight frames, and H is a closed set of Hermitian matrices, then Theorem 16 applies with Y := Gα and Z := H.

B. Stronger Convergence Results

Meyer's Theorem suggests that it might be possible to provide a stronger convergence result for Algorithm 1 if we can ensure that the matrix nearness problems have unique solutions. In many cases, the nearness problems are uniquely soluble whenever the iterates get sufficiently close together. This provides a local convergence result that is much stronger than Zangwill's Theorem allows. First, we prove a general version of this result. Afterward, we show that it applies to an alternating projection that involves one of the spectral constraint sets Xα or Gα.

Recall that the distance between a matrix M and a set Y is defined as

dist(M, Y) := inf_{Y ∈ Y} ‖M − Y‖_F.

Theorem 19: Let Y and Z be closed sets of matrices, one of which is compact. Suppose that the alternating projection between Y and Z generates a sequence of iterates (Y_j, Z_j), and assume that the matrix nearness problems

min_{Y ∈ Y} ‖Y − M‖_F and min_{Z ∈ Z} ‖Z − M‖_F

have unique solutions for any matrix M in the sequence of iterates. Then we reach the following conclusions.
• The sequence of iterates possesses at least one accumulation point, say (Ȳ, Z̄).
• The accumulation point lies in Y × Z.
• The pair (Ȳ, Z̄) is a fixed point of the alternating projection. In other words, if we applied the algorithm to Ȳ or to Z̄, every iterate would equal (Ȳ, Z̄).
• The accumulation point satisfies

‖Ȳ − Z̄‖_F = lim_{j→∞} ‖Y_j − Z_j‖_F.

• The component sequences are asymptotically regular, i.e.

‖Y_{j+1} − Y_j‖_F → 0 and ‖Z_{j+1} − Z_j‖_F → 0.

• Either the component sequences both converge in norm,

‖Y_j − Ȳ‖_F → 0 and ‖Z_j − Z̄‖_F → 0,

or the set of accumulation points forms a continuum.

Proof: The argument in the proof of Theorem 16 shows

that we are performing an alternating minimization between two compact sets. The hypotheses of the theorem guarantee that each iterate is uniquely determined by the previous iterate. Corollaries 14 and 15 furnish the stated conclusions.

The only point that may require clarification is what it takes for a pair of matrices (Ȳ, Z̄) to be a classical fixed point of the alternating projection. A classical fixed point of an algorithm is the only possible successor of itself. In the case of alternating projection, the matrix Z̄ must be the unique successor of Ȳ, and the matrix Ȳ must be the unique successor of Z̄. This observation completes the argument.

Due to the peculiar structure of the spectral constraint sets Xα and Gα, the solutions to the associated matrix nearness problems are often unique. Therefore, the alternating projection algorithms that we have considered in this paper sometimes have better performance than the basic convergence result, Theorem 16, would predict.

We remind the reader that

$$\mathscr{X}_\alpha \stackrel{\mathrm{def}}{=} \{ X \in \mathbb{C}^{d \times N} : X X^* = \alpha\, \mathsf{I}_d \}, \qquad\text{and}$$

$$\mathscr{G}_\alpha \stackrel{\mathrm{def}}{=} \{ G \in \mathbb{C}^{N \times N} : G = G^*, \text{ and } G \text{ has eigenvalues } (\underbrace{\alpha, \dots, \alpha}_{d}, 0, \dots, 0) \}.$$
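For concreteness, the two projections can be realized numerically. The following NumPy sketch is ours, not the paper's code; it takes $\alpha = 1$, uses the unitary polar factor $U V^*$ for the projection onto $\mathscr{X}_1$, and, for a Hermitian input as in Theorem 3, keeps the $d$ leading eigenvectors for the projection onto $\mathscr{G}_1$:

```python
import numpy as np

def project_tight(S):
    # Nearest 1-tight frame to a full-rank S: (S S*)^{-1/2} S, which equals
    # the unitary polar factor U V* of the SVD S = U Sigma V*.
    U, _, Vh = np.linalg.svd(S, full_matrices=False)
    return U @ Vh

def project_gram(H, d):
    # Nearest matrix in G_1 to a Hermitian H: keep the eigenvectors of the
    # d algebraically largest eigenvalues and reset those eigenvalues to one.
    _, V = np.linalg.eigh(H)        # eigenvalues returned in ascending order
    Vd = V[:, -d:]                  # the d leading eigenvectors
    return Vd @ Vd.conj().T
```

Each function returns a matrix that lies exactly (up to floating-point error) in the corresponding constraint set.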

The uniqueness of the solutions to these matrix nearness problems will follow from the Wielandt–Hoffman Theorem, a powerful result from matrix analysis.

Theorem 20 (Wielandt–Hoffman [25]): Suppose that $A$ and $B$ are $N \times N$ Hermitian matrices, and let the vectors $\lambda(A)$ and $\lambda(B)$ list the eigenvalues of $A$ and $B$ in algebraically non-increasing order. Then

$$\|\lambda(A) - \lambda(B)\|_2 \le \|A - B\|_{\mathrm{F}}.$$

Suppose instead that $A$ and $B$ are $d \times N$ rectangular matrices with $d \le N$, and let $\sigma(A)$ and $\sigma(B)$ list the largest $d$ singular values of $A$ and $B$ in non-increasing order. Then

$$\|\sigma(A) - \sigma(B)\|_2 \le \|A - B\|_{\mathrm{F}}.$$
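As an illustrative sanity check (ours, with arbitrary random matrices and seed), the Hermitian case of the theorem is easy to verify numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)); A = (A + A.T) / 2   # random Hermitian A
B = rng.standard_normal((4, 4)); B = (B + B.T) / 2   # random Hermitian B

# Eigenvalues listed in non-increasing order on both sides of the inequality.
lam_A = np.sort(np.linalg.eigvalsh(A))[::-1]
lam_B = np.sort(np.linalg.eigvalsh(B))[::-1]
lhs = np.linalg.norm(lam_A - lam_B)        # || lambda(A) - lambda(B) ||_2
rhs = np.linalg.norm(A - B, 'fro')         # || A - B ||_F
```

The theorem guarantees `lhs` never exceeds `rhs`, whatever Hermitian pair is drawn.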

Note that if we were solving the matrix nearness problems with respect to the spectral norm, Weyl's Theorem would allow us to provide stronger bounds [25].

Corollary 21 (Local Convergence with Constraint $\mathscr{X}_\alpha$): Let $\mathscr{S}$ be a closed set of $d \times N$ matrices for which the associated matrix nearness problem

$$\min_{S \in \mathscr{S}} \|S - M\|_{\mathrm{F}}$$

has a unique solution whenever $\operatorname{dist}(M, \mathscr{S}) < \varepsilon$. Imagine that the alternating projection between $\mathscr{S}$ and $\mathscr{X}_\alpha$ generates a sequence of iterates $\{(S_j, X_j)\}$ in which

$$\|S_J - X_J\|_{\mathrm{F}} < \min\{\varepsilon, \alpha\} \qquad\text{for some index } J.$$


Then the conclusions of Theorem 19 are in force.

Proof: According to Theorem 2, the matrix in $\mathscr{X}_\alpha$ nearest to a matrix $M$ is unique so long as $M$ has full rank. A $d \times N$ matrix is rank-deficient only if its $d$-th largest singular value is zero. Observe that the largest $d$ singular values of each matrix in $\mathscr{X}_\alpha$ all equal $\alpha > 0$. According to the Wielandt–Hoffman Theorem, any matrix sufficiently close to $\mathscr{X}_\alpha$ cannot be rank-deficient. More precisely, $\operatorname{dist}(M, \mathscr{X}_\alpha) < \alpha$ implies that $M$ has full rank, which in turn shows that $M$ has a unique best approximation in $\mathscr{X}_\alpha$.

Define the constraint sets

$$\mathscr{Y} \stackrel{\mathrm{def}}{=} \mathscr{S} \cap \operatorname{closure}\{S_j : j \ge J\} \qquad\text{and}$$

$$\mathscr{Z} \stackrel{\mathrm{def}}{=} \mathscr{X}_\alpha \cap \operatorname{closure}\{X_j : j \ge J\}.$$

Note that $\mathscr{Y}$ is closed and that $\mathscr{Z}$ is compact. We will apply Theorem 19 to the tail of the sequence of iterates, beginning with index $J$. For $j \ge J$, each matrix $S_j$ is close enough to $\mathscr{Z}$ and each matrix $X_j$ is close enough to $\mathscr{Y}$ that we can ensure the matrix nearness problems have unique solutions.

Corollary 22 (Local Convergence with Constraint $\mathscr{G}_\alpha$): Let $\mathscr{H}$ be a closed set of $N \times N$ matrices for which the associated matrix nearness problem

$$\min_{H \in \mathscr{H}} \|H - M\|_{\mathrm{F}}$$

has a unique solution whenever $\operatorname{dist}(M, \mathscr{H}) < \varepsilon$. Imagine that the alternating projection between $\mathscr{G}_\alpha$ and $\mathscr{H}$ generates a sequence of iterates $\{(G_j, H_j)\}$ in which

$$\|G_J - H_J\|_{\mathrm{F}} < \min\{\varepsilon, \alpha/\sqrt{2}\} \qquad\text{for some index } J.$$

The conclusions of Theorem 19 are in force.

Proof: Theorem 3 indicates that the matrix in $\mathscr{G}_\alpha$ nearest to a matrix $M$ is unique so long as its $d$-th and $(d+1)$-st eigenvalues are distinct. Imagine that $M$ is a matrix whose $d$-th and $(d+1)$-st eigenvalues both equal $\tau$. Since the $d$-th and $(d+1)$-st eigenvalues of a matrix in $\mathscr{G}_\alpha$ are $\alpha$ and zero, the Wielandt–Hoffman Theorem shows that

$$\operatorname{dist}(M, \mathscr{G}_\alpha)^2 \ge (\alpha - \tau)^2 + \tau^2.$$

Varying $\tau$, the minimum value of the right-hand side is $\alpha^2/2$. Therefore, $\operatorname{dist}(M, \mathscr{G}_\alpha) < \alpha/\sqrt{2}$ implies that the $d$-th and $(d+1)$-st eigenvalues of $M$ are distinct. In consequence, $M$ has a unique best approximation from $\mathscr{G}_\alpha$.
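For completeness, the minimization over $\tau$ can be carried out explicitly:

```latex
f(\tau) = (\alpha - \tau)^2 + \tau^2, \qquad
f'(\tau) = -2(\alpha - \tau) + 2\tau = 0
\;\Longrightarrow\; \tau = \alpha/2,
\qquad
f(\alpha/2) = \frac{\alpha^2}{4} + \frac{\alpha^2}{4} = \frac{\alpha^2}{2}.
```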

As before, define the constraint sets

$$\mathscr{Y} \stackrel{\mathrm{def}}{=} \mathscr{H} \cap \operatorname{closure}\{H_j : j \ge J\} \qquad\text{and}$$

$$\mathscr{Z} \stackrel{\mathrm{def}}{=} \mathscr{G}_\alpha \cap \operatorname{closure}\{G_j : j \ge J\}.$$

The set $\mathscr{Y}$ is closed, and $\mathscr{Z}$ is compact. We will apply Theorem 19 to the tail of the sequence of iterates, beginning with index $J$. For $j \ge J$, each matrix $H_j$ is close enough to $\mathscr{Z}$ and each matrix $G_j$ is close enough to $\mathscr{Y}$ that we can ensure the matrix nearness problems have unique solutions.

C. Specified Column Norms

This section offers a detailed analysis of the alternating projection between the set of $\alpha$-tight frames and the collection of matrices with specified column norms.

Let $c_1, \dots, c_N$ be positive numbers that denote the squared column norms we desire in the frame. Without loss of generality, we assume that $\sum_n c_n / d = 1$ to streamline the proofs. Then the structural constraint set is

$$\mathscr{S} \stackrel{\mathrm{def}}{=} \{ S \in \mathbb{C}^{d \times N} : \|s_n\|_2^2 = c_n \}.$$

The tightness parameter $\alpha$ of the frame must then equal one, so we define the set of 1-tight frames as

$$\mathscr{X}_1 \stackrel{\mathrm{def}}{=} \{ X \in \mathbb{C}^{d \times N} : X X^* = \mathsf{I}_d \}.$$

Suppose that $S_0$ is a full-rank matrix drawn from $\mathscr{S}$, and perform an alternating projection between the sets $\mathscr{S}$ and $\mathscr{X}_1$ to obtain sequences $\{S_j\}$ and $\{X_j\}$. Proposition 23 of the sequel shows that the sequence $\{S_j\}$ lies in a compact subset of $\mathscr{S}$ whose elements have full rank, while the sequence $\{X_j\}$ lies in a compact subset of $\mathscr{X}_1$ whose elements have non-zero columns. By an appeal to the matrix nearness results, Theorem 2 and Proposition 5, we see that each iterate is uniquely determined by its predecessor. We may therefore apply Corollary 15.
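As an illustration (our sketch, not the paper's code; the dimensions, the equal norms $c_n = d/N$, and the iteration count are arbitrary choices), this alternating projection takes only a few lines of NumPy. Each half-step is a nearest-point projection, so the recorded gap between the paired iterates never increases:

```python
import numpy as np

def nearest_tight(S):
    # Project onto X_1: (S S*)^{-1/2} S, the unitary polar factor U V*.
    U, _, Vh = np.linalg.svd(S, full_matrices=False)
    return U @ Vh

def nearest_colnorms(X, c):
    # Project onto S: rescale each column to squared norm c_n.
    return X * (np.sqrt(c) / np.linalg.norm(X, axis=0))

d, N = 3, 6
c = np.full(N, d / N)                    # equal norms with sum(c_n) = d
rng = np.random.default_rng(0)
S = nearest_colnorms(rng.standard_normal((d, N)), c)

gaps = []
for _ in range(200):
    X = nearest_tight(S)                 # project onto the 1-tight frames
    S = nearest_colnorms(X, c)           # project back onto fixed column norms
    gaps.append(np.linalg.norm(X - S))   # Frobenius gap; non-increasing
```

In this equal-norm case the gap typically decays toward zero, so the iterates approach a unit-norm tight frame.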

In this subsection, we complete the foregoing argument by demonstrating that the iterates are well-behaved. In the next subsection, we classify the full-rank fixed points of the alternating projection between $\mathscr{S}$ and $\mathscr{X}_1$.

Set $c_{\min} = \min_n c_n$, and define the diagonal matrix $C$ whose entries are $\sqrt{c_1}, \dots, \sqrt{c_N}$.

Proposition 23: Assume that the initial iterate $S_0$ is a full-rank matrix from $\mathscr{S}$. For every positive index $j$,

1) the Euclidean norm of each column of $X_j$ exceeds $\sqrt{c_{\min}} / \|C\|_{\mathrm{F}}$; and
2) the smallest singular value of $S_j$ exceeds $\sqrt{c_{\min}}$.

The matrices that satisfy these bounds form compact subsets of $\mathscr{X}_1$ and $\mathscr{S}$.

Proof: Assume that $j \ge 0$, and make the inductive assumption that $S_j$ has full rank. First, we bound the top singular value of $S_j$ by exploiting the relationship between the singular values of a matrix and its Frobenius norm. Since $C$ lists the column norms of $S_j$, it follows that $\|S_j\|_{\mathrm{F}}^2 = \|C\|_{\mathrm{F}}^2$. The squared Frobenius norm also equals the sum of the squared singular values of $S_j$. It is immediate that the maximum singular value of $S_j$ satisfies

$$\sigma_{\max}(S_j)^2 \le \|C\|_{\mathrm{F}}^2. \qquad (9)$$

Next we use this relation to estimate the column norms of $X_j$. Let $S_j$ have singular value decomposition $U \Sigma V^*$, and write the $n$-th columns of $S_j$ and $X_j$ as $s_n$ and $x_n$. On account of the fact that $X_j = (S_j S_j^*)^{-1/2} S_j$, we have

$$\|x_n\|_2 = \| (S_j S_j^*)^{-1/2}\, s_n \|_2 = \| U \Sigma^{-1} U^*\, s_n \|_2 \ge \sqrt{c_{\min}} / \sigma_{\max}(S_j), \qquad (10)$$


since the norm of $s_n$ is at least $\sqrt{c_{\min}}$. Introducing the estimate (9) into (10) yields the first part of the proposition.

Now, we show that the smallest singular value of $S_{j+1}$ remains well away from zero. The Courant–Fischer Theorem for singular values [25] states that one may calculate the $k$-th largest singular value of a matrix $B \in \mathbb{C}^{d \times N}$ as

$$\sigma_k(B) = \max_{\mathscr{Z}} \min_{z \ne 0} \frac{\|B z\|_2}{\|z\|_2},$$

where $\mathscr{Z}$ ranges over all $k$-dimensional subspaces of $\mathbb{C}^N$ and $z \in \mathscr{Z}$ [25]. Define $T_j$ to be the diagonal matrix that lists the column norms of $X_j$. Therefore, the nearest matrix in $\mathscr{S}$ can be written as $S_{j+1} = X_j T_j^{-1} C$. Then put

$$\mathscr{Z} \stackrel{\mathrm{def}}{=} \{ C^{-1} T_j\, x : x \in \operatorname{rowspan} X_j \}.$$

Since $X_j$ has full row-rank, $\mathscr{Z}$ forms a $d$-dimensional subspace. Select a unit vector $z$ from $\mathscr{Z}$, and express it as $z = C^{-1} T_j\, x$ for some $x$ in $\operatorname{rowspan} X_j$. By construction, $X_j$ has orthonormal rows, so we may compute

$$\begin{aligned}
\|S_{j+1}\, z\|_2 &= \| X_j T_j^{-1} C\, z \|_2 \\
&= \| X_j\, x \|_2 \\
&= \| x \|_2 \qquad (\text{since } x \in \operatorname{rowspan} X_j) \\
&= \| T_j^{-1} C\, z \|_2.
\end{aligned}$$

The matrix $X_j$ is a submatrix of a unitary matrix, so its column norms cannot exceed one. Thus every entry of $T_j^{-1}$ must be at least one. It follows that

$$\|S_{j+1}\, z\|_2 \ge \sqrt{c_{\min}}.$$

Applying the Courant–Fischer Theorem yields

$$\sigma_{\min}(S_{j+1}) \ge \sqrt{c_{\min}}.$$

The second part of the proposition is complete.

Finally, we must make the compactness argument. We have shown that the singular values of an iterate $S_j$ must lie in the closed interval $[\sqrt{c_{\min}}, \|C\|_{\mathrm{F}}]$. The smallest singular value of a matrix is a continuous function of the matrix entries, which follows from the Wielandt–Hoffman Theorem. Therefore, the matrices whose smallest singular value lies in this interval form a closed set. We conclude that the intersection of this set with the compact set $\mathscr{S}$ is compact. The same argument implies that the sequence $\{X_j\}$ lies in a compact subset of $\mathscr{X}_1$ whose matrices have column norms bounded away from zero.
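A numerical spot-check of these bounds (our sketch; the unequal norms below are an arbitrary choice with $\sum_n c_n = d$) simply runs the iteration and tests both inequalities at every step:

```python
import numpy as np

d, N = 3, 6
c = np.array([0.3, 0.4, 0.5, 0.5, 0.6, 0.7])      # sum(c_n) = 3 = d
C_fro = np.sqrt(c.sum())                          # ||C||_F
rng = np.random.default_rng(0)
S = rng.standard_normal((d, N))
S *= np.sqrt(c) / np.linalg.norm(S, axis=0)       # a full-rank S_0 in the set S

for _ in range(50):
    U, _, Vh = np.linalg.svd(S, full_matrices=False)
    X = U @ Vh                                    # project onto X_1
    # Part 1: every column norm of X_j exceeds sqrt(c_min) / ||C||_F.
    assert np.linalg.norm(X, axis=0).min() >= np.sqrt(c.min()) / C_fro - 1e-9
    S = X * (np.sqrt(c) / np.linalg.norm(X, axis=0))   # project back onto S
    # Part 2: the smallest singular value of S_j exceeds sqrt(c_min).
    assert np.linalg.svd(S, compute_uv=False).min() >= np.sqrt(c.min()) - 1e-9
```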

D. Fixed Points I

It remains to characterize the fixed points of the alternating projection between the set of matrices with fixed column norms and the set of $\alpha$-tight frames.

Proposition 24: The full-rank stationary points of an alternating projection between $\mathscr{S}$ and $\mathscr{X}_\alpha$ are precisely those full-rank matrices $S$ from $\mathscr{S}$ whose columns are all eigenvectors of $S S^*$. That is, $S S^* S = S \Lambda$, where $\Lambda \in \mathbb{C}^{N \times N}$ is diagonal and positive with at most $d$ distinct values.

Proof: Define the diagonal matrix $T = T(S)$ whose entries are the column norms of $\alpha\, (S S^*)^{-1/2} S$.

Suppose that $S$ is a full-rank fixed point of the algorithm. Thus projecting $S$ onto $\mathscr{X}_\alpha$ and projecting back to $\mathscr{S}$ returns $S$. Symbolically,

$$S = \bigl( \alpha\, (S S^*)^{-1/2} S \bigr)\, (T^{-1} C).$$

Define $\Lambda = \alpha\, T^{-1} C$. Then the equation becomes $(S S^*)^{-1/2} S = S \Lambda^{-1}$. Due to the joint eigenstructure of a positive-definite matrix and its positive-definite roots [25], it follows that $(S S^*)\, S = S \Lambda^2$.

Conversely, suppose that $S$ has full rank and that $(S S^*)\, S = S \Lambda^2$ for some positive diagonal matrix $\Lambda$. Equivalently, $(S S^*)\, s_n = \lambda_n^2\, s_n$ for each $n$. It follows that

$$(S S^*)^{-1/2}\, s_n = \lambda_n^{-1}\, s_n \qquad \text{for each } n.$$

Multiply by $\alpha$, and take norms to see that $t_n = \alpha\, \lambda_n^{-1} \sqrt{c_n}$. Combine these equations into the matrix equation $\Lambda = \alpha\, T^{-1} C$. It follows that $S$ is a fixed point of the algorithm.
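To make the characterization concrete, here is a tiny instance (ours, with $\alpha = 1$ and unit column norms chosen for simplicity): $S = [\,e_1\; e_1\; e_2\,]$ has full rank, every column is an eigenvector of $S S^* = \operatorname{diag}(2, 1)$, and one round trip of the alternating projection returns $S$ even though $S$ is not tight:

```python
import numpy as np

# S = [e1, e1, e2]: unit-norm columns, all eigenvectors of S S* = diag(2, 1).
S = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

U, _, Vh = np.linalg.svd(S, full_matrices=False)
X = U @ Vh                                 # nearest 1-tight frame, (S S*)^{-1/2} S
S_next = X / np.linalg.norm(X, axis=0)     # rescale columns back to unit norm
```

Up to floating-point error, `S_next` reproduces `S`, while `S @ S.T` is not a multiple of the identity, so this fixed point is not a tight frame.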

E. Fixed Points II

Proposition 24 allows us to provide a partial characterization of the fixed points of any alternating projection between the set of $\alpha$-tight frames $\mathscr{X}_\alpha$ and any structural constraint set $\mathscr{Z}$ that contains matrices with fixed column norms. This result applies even if the matrices in $\mathscr{Z}$ have additional properties.

Proposition 25: Suppose that the column norms of matrices in $\mathscr{Z}$ are fixed. A sufficient condition for a full-rank matrix $Z$ in $\mathscr{Z}$ to be a fixed point of the alternating projection between $\mathscr{Z}$ and $\mathscr{X}_\alpha$ is that the columns of $Z$ are all eigenvectors of $Z Z^*$. That is, $Z Z^* Z = Z \Lambda$, where $\Lambda$ is a positive, diagonal matrix with no more than $d$ distinct entries.

Proof: Let $\mathscr{Z}$ be a closed subset of the closed set $\mathscr{S}$, a collection of matrices with prescribed column norms. Suppose that $P(\cdot)$ is a sufficient condition for $S$ to be a fixed point of the alternating projection between $\mathscr{S}$ and $\mathscr{X}_\alpha$. Assume that $P(Z)$ holds for a matrix $Z$ in $\mathscr{Z}$, and let $X$ be the matrix in $\mathscr{X}_\alpha$ closest to $Z$. Since $P(Z)$ holds and $Z \in \mathscr{S}$, it follows that $Z$ is the matrix in $\mathscr{S}$ closest to $X$. Therefore, $Z$ is also a fixed point of the alternating minimization between $\mathscr{Z}$ and $\mathscr{X}_\alpha$. An appeal to Proposition 24 completes the proof.

REFERENCES

[1] O. Christensen, An Introduction to Frames and Riesz Bases. Boston: Birkhäuser, 2003.

[2] J. A. Tropp, R. W. Heath, and T. Strohmer, “Optimal CDMA signaturesequences, inverse eigenvalue problems and alternating projection,” inProceedings of the 2003 IEEE International Symposium on InformationTheory, Yokohama, July 2003, p. 407.

[3] J. A. Tropp, I. S. Dhillon, R. W. Heath, and T. Strohmer, “CDMA signature sequences with low peak-to-average-power ratio via alternating projection,” in Proceedings of the 37th Annual Asilomar Conference on Signals, Systems and Computers, Monterey, Nov. 2003.

[4] E. W. Cheney and A. A. Goldstein, “Proximity maps for convex sets,”Proc. Amer. Math. Soc., vol. 10, no. 3, pp. 448–450, June 1959.

[5] L. M. Bregman, “Finding the common point of convex sets by themethod of successive projection,”Dokl. Akad. Nauk. USSR, vol. 162,pp. 487–490, 1965.

[6] M. Rupf and J. L. Massey, “Optimum sequence multisets for syn-chronous code-division multiple-access channels,”IEEE Trans. Inform.Theory, vol. 40, no. 4, pp. 1261–1266, July 1994.


[7] P. Viswanath and V. Anantharam, “Optimal sequences and sum capacityof synchronous CDMA systems,”IEEE Trans. Inform. Theory, vol. 45,no. 6, pp. 1984–1991, Sept. 1999.

[8] P. Viswanath and V. Anantharam, “Optimal sequences for CDMA under colored noise: A Schur-saddle function property,” IEEE Trans. Inform. Theory, vol. 48, no. 6, pp. 1295–1318, June 2002.

[9] L. Zheng and D. N. Tse, “Communication on the Grassmann manifold:A geometric approach to the non-coherent multiple-antenna channel,”IEEE Trans. Inform. Theory, vol. 48, no. 2, pp. 359–383, Feb. 2002.

[10] R. W. Heath, T. Strohmer, and A. J. Paulraj, “Onquasi-orthogonal signatures for CDMA systems,” Sept. 2002,submitted to IEEE Trans. Inform. Theory. [Online]. Available:http://www.ece.utexas.edu/˜rheath/papers/2002/quasicdma/

[11] V. Tarokh and I.-M. Kim, “Existence and construction of noncoherentunitary space-time codes,”IEEE Trans. Inform. Theory, vol. 48, no. 12,pp. 3112–3117, Dec. 2002.

[12] T. Strohmer and R. W. Heath, “Grassmannian frames with applicationsto coding and communication,”Appl. Comp. Harmonic Anal., vol. 14,no. 3, pp. 257–275, May 2003.

[13] R. Gribonval and M. Nielsen, “Sparse representations in unions ofbases,” Institut de Recherce en Informatique et Systemes Aleatoires,Tech. Rep. 1499, Nov. 2002.

[14] J. A. Tropp, “Greed is good: Algorithmic results for sparse approximation,” University of Texas at Austin, ICES Report, Feb. 2003.

[15] D. L. Donoho and M. Elad, “Maximal sparsity representation by $\ell_1$ minimization,” Proc. Natl. Acad. Sci., vol. 100, pp. 2197–2202, Mar. 2003.

[16] D. V. Sarwate, “Meeting the Welch Bound with equality,” inSequencesand their Applications. London: Springer, 1998, pp. 79–102.

[17] P. G. Casazza and M. T. Leon, “Existence and construction of finitetight frames,” 2002, unpublished.

[18] J. A. Tropp, I. S. Dhillon, and R. W. Heath, “Finite-step algorithms for constructing optimal CDMA signature sequences,” The University of Texas at Austin, Wireless Networking and Communications Group Technical Report WNCG-TR-2003-05-08, May 2003, submitted. [Online]. Available: http://www.ece.utexas.edu/˜rheath/papers/2003/finitestep/

[19] C. Rose, “CDMA codeword optimization: interference avoidance andclass warfare,”IEEE Trans. Inform. Theory, vol. 47, no. 6, pp. 2368–2382, Sept. 2001.

[20] S. Ulukus and R. D. Yates, “Iterative construction of optimum signaturesequence sets in synchronous CDMA systems,”IEEE Trans. Inform.Theory, vol. 47, no. 5, pp. 1989–1998, 2001.

[21] C. Rose, S. Ulukus, and R. D. Yates, “Wireless systems and interferenceavoidance,”IEEE Trans. Wireless Comm., vol. 1, no. 3, pp. 415–428,Jul. 2002.

[22] S. Ulukus and A. Yener, “Iterative joint optimization of CDMA signa-ture sequence and receiver filters,” in2002 Conference on InformationSciences and Systems, Princeton Univ., Mar. 2002.

[23] P. Anigstein and V. Anantharam, “Ensuring convergence of the MMSEiteration for interference avoidance to the global optimum,”IEEETrans. Inform. Th., vol. 49, no. 4, pp. 873–885, Apr. 2003.

[24] L. R. Welch, “Lower bounds on the maximum cross-correlation ofsignals,” IEEE Trans. Inform. Theory, vol. 20, pp. 397–399, 1974.

[25] R. A. Horn and C. R. Johnson,Matrix Analysis. Cambridge UniversityPress, 1985.

[26] A. W. Marshall and I. Olkin,Inequalities: Theory of majorization andits applications. Academic Press, 1979.

[27] P. G. Casazza, M. Leon, and J. C. Tremain, “Custom built tight frames,”2002, unpublished.

[28] W. M. Boothby, An Introduction to Differentiable Manifolds andRiemannian Geometry. Academic Press, 2003.

[29] J. H. Conway, R. H. Hardin, and N. J. A. Sloane, “Packing lines,planes, etc.: Packings in Grassmannian spaces,”Experimental Math.,vol. 5, no. 2, pp. 139–159, 1996.

[30] M. T. Chu, “Inverse eigenvalue problems,”SIAM Rev., vol. 40, no. 1,pp. 1–39, 1998.

[31] M. T. Chu and G. H. Golub, “Structured inverse eigenvalue problems,”Acta Numerica, 2001.

[32] S. Friedland, J. Nocedal, and M. L. Overton, “The formulation andanalysis of numerical methods for inverse eigenvalue problems,”SIAMJ. Numer. Anal., pp. 634–667, 1987.

[33] M. T. Chu, “Numerical methods for inverse singular value problems,”SIAM J. Numer. Anal., vol. 29, pp. 885–903, 1992.

[34] ——, “On the refinement of a Newton method for the inverse Toeplitzeigenvalue problem,” North Carolina State University, Tech. Rep.,1994.

[35] M. T. Chu and K. R. Driessel, “The projected gradient method forleast squares matrix approximations with spectral constraints,”SIAMJ. Numer. Anal., vol. 27, no. 4, pp. 1050–1060, Aug. 1990.

[36] M. T. Chu, “Constructing a Hermitian matrix from its diagonal entriesand eigenvalues,”SIAM J. Matrix Anal. Appl., vol. 16, pp. 207–217,1995.

[37] N. J. Higham, “Computing the nearest correlation matrix—a problemfrom finance,”IMA J. Numer. Anal., vol. 22, pp. 329–343, 2002.

[38] M. T. Chu, R. Funderlic, and R. J. Plemmons, “Structured low-rankapproximation,”Linear Algebra Appl., vol. 366, pp. 157–172, 2003.

[39] J. von Neumann,Functional Operators, Vol. II, ser. Annals of Mathe-matics Studies. Princeton Univ. Press, 1950, no. 22.

[40] G. H. Golub, A. Hoffman, and G. W. Stewart, “A generalization of theEckart-Young-Mirsky matrix approximation theorem,”Linear AlgebraAppl., vol. 88/89, pp. 317–327, 1987.

[41] A. Ruhe, “Closest normal matrix finally found!”BIT, vol. 27, pp. 585–598, 1987.

[42] N. J. Higham, “Computing a nearest symmetric positive semi-definitematrix,” Linear Algebra Appl., vol. 103, pp. 103–118, 1988.

[43] ——, Applications of Matrix Theory. Oxford Univ. Press, 1989, ch.Matrix nearness problems and applications, pp. 1–27.

[44] G. H. Golub and C. F. Van Loan,Matrix Computations, 3rd ed. JohnsHopkins University Press, 1996.

[45] B. F. Green, “The orthogonal approximation of an oblique structure infactor analysis,”Psychometrika, vol. 17, pp. 429–440, 1952.

[46] P. H. Schonemann, “A generalized solution of the orthogonal Procrustesproblem,” Psychometrika, vol. 31, pp. 1–10, 1966.

[47] N. J. Higham, “Computing the polar decomposition—with applica-tions,” SIAM J. Sci. Stat. Comput., no. 7, pp. 1160–1174, 1986.

[48] N. J. Higham and P. Papadimitriou, “A parallel algorithm for computingthe polar decomposition,”Parallel Comput., vol. 20, no. 8, pp. 1161–1173, Aug. 1994.

[49] N. Aronszajn,Introduction to the Theory of Hilbert Spaces, Vol. I.Stillwater, OK: Research Foundation of Oklahoma A & M College,1950.

[50] S. Kayalar and H. Weinert, “Error bounds for the method of alternatingprojections,”Math. Control Signal Systems, pp. 43–59, 1988.

[51] J. de Leeuw,Information Systems and Data Analysis. Berlin: Springer,1994, ch. Block relaxation algorithms in statistics, pp. 308–325.

[52] F. Deutsch,Best Approximation in Inner-Product Spaces. New York:Springer-Verlag, 2001.

[53] J. J. Benedetto and M. Fickus, “Finite normalized tight frames,”Adv.Comp. Math., 2002, to appear.

[54] P. Casazza and J. Kovacevic, “Equal-norm tight frames with erasures,”Adv. Comp. Math., 2002.

[55] V. K. Goyal, M. Vetterli, and N. T. Thao, “Quantized overcomplete expansions in $\mathbb{R}^n$: analysis, synthesis and algorithms,” IEEE Trans. Inform. Theory, vol. 44, no. 1, pp. 16–31, 1998.

[56] M. Sustik, J. A. Tropp, I. S. Dhillon, and R. W. Heath, “Necessary conditions for existence of equiangular tight frames,” University of Texas at Austin, Dept. of Computer Sciences, 2003, in preparation.

[57] A. R. Calderbank, P. J. Cameron, W. M. Kantor, and J. J. Seidel, “Z4-Kerdock codes, orthogonal spreads and extremal Euclidean line sets,”Proc. London Math. Soc., vol. 75, no. 2, pp. 436–480, 1997.

[58] N. J. A. Sloane, “Packing planes in four dimensions and other mys-teries,” in Proceedings of the Conference on Algebraic Combinatoricsand Related Topics, Yamagata, Japan, Nov. 1997.

[59] D. Agrawal, T. J. Richardson, and R. L. Urbanke, “Multiple-antennasignal constellations for fading channels,”IEEE Trans. Inform. Theory,vol. 47, no. 6, pp. 2618–2626, Sept. 2001.

[60] J. Tropp, I. S. Dhillon, R. W. Heath Jr., and T. Strohmer, “Constructing Grassmannian packings via alternating projection,” Dec. 2003, in preparation.

[61] V. K. N. Lau, “On the analysis of peak-to-average ratio (PAR) for IS95 and CDMA2000 systems,” IEEE Trans. Vehicular Tech., vol. 49, no. 6, pp. 2174–2188, Nov. 2000.

[62] R. T. Rockafellar, Convex Analysis. Princeton Univ. Press, 1970.

[63] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Kluwer Academic Publishers, 1992.

[64] I. S. Dhillon and D. S. Modha, “Concept decompositions for large sparse text data using clustering,” Machine Learning, vol. 42, no. 1, pp. 143–175, 2001.

[65] I. S. Dhillon, E. Marcotte, and U. Roshan, “Diametrical clustering foridentifying anti-correlated gene clusters,”Bioinformatics, vol. 19, pp.1612–1619, 2003.


[66] B. M. Hochwald, T. L. Marzetta, T. J. Richardson, W. Sweldens, andR. Urbanke, “Systematic design of unitary space-time constellations,”IEEE Trans. Inform. Theory, vol. 46, no. 6, pp. 1962–1973, 2000.

[67] R. R. Meyer, “Sufficient conditions for the convergence of monotonicmathematical programming algorithms,”J. Comp. Sys. Sci., vol. 12,pp. 108–121, 1976.

[68] I. Csiszar and G. Tusnady, “Information geometry and alternatingminimization procedures,”Statistics and Decisions, vol. 1, pp. 205–237, 1984, supplemental issue.

[69] W. I. Zangwill, Nonlinear Programming: A Unified Approach. En-glewood Cliffs: Prentice-Hall, 1969.

[70] W. W. Hogan, “Point-to-set maps in mathematical programming,”SIAMRev., vol. 15, no. 3, pp. 591–603, July 1973.

[71] P. Huard, Ed.,Point-to-Set Maps and Mathematical Programming,ser. Mathematical Programming Study. Amsterdam: North-HollandPublishing. Co., 1979, no. 10.

[72] G. B. Dantzig, J. Folkman, and N. Shapiro, “On the continuity of theminimum set of a continuous function,”J. Math. Anal. Appl., vol. 17,pp. 519–548, 1967.

[73] A. M. Ostrovsky, Solution of Equations and Systems of Equations.New York: Academic Press, 1966.

[74] J. C. Fiorot and P. Huard, “Composition and union of general algo-rithms of optimization,” inPoint-to-Set Maps and Mathematical Pro-gramming, ser. Mathematical Programming Studies. North Holland,1979, vol. 10, pp. 69–85.

[75] F. Deutsch,Multivariate Approximation Theory. Basel: BirkhauserVerlag, 1979, ch. The alternating method of von Neumann, pp. 83–96.

[76] ——, Approximation Theory, Spline Functions and Applications. TheNetherlands: Kluwer Academic Publishers, 1992, ch. The method ofalternating orthogonal projections, pp. 105–121.

[77] N. Wiener, “On the factorization of matrices,”Comment. Math. Helv.,vol. 29, pp. 97–111, 1955.

[78] I. Halperin, “The product of projection operators,”Acta Sci. Math.,vol. 23, pp. 96–99, 1962.

[79] K. T. Smith, D. C. Solmon, and S. L. Wagner, “Practical and mathemat-ical aspects of the problem of reconstructing objects from radiographs,”Bull. Amer. Math. Soc., vol. 83, pp. 1227–1270, 1977.

[80] J. Xu and L. Zikatonov, “The method of alternating projections andthe method of subspace corrections in Hilbert space,”J. Amer. Math.Soc., vol. 15, no. 3, pp. 573–597, Apr. 2002.

[81] H. H. Bauschke, F. Deutsch, H. Hundal, and S.-H. Park, “Acceleratingthe convergence of the method of alternating projections,”Trans. Amer.Math. Soc., vol. 355, no. 9, pp. 3433–3461, May 2003.

[82] S. P. Diliberto and E. G. Straus, “On the approximation of a functionof several variables by the sum of functions of fewer variables,”PacificJ. Math., vol. 1, pp. 195–210, 1951.

[83] G. Aumann,Uber approximative Nomographie I, II and III. Bayer.Akad. Wiss. Math.-Nat. Kl. S.B., 1958–1960.

[84] M. Golomb, On Numerical Approximation. Madison: Univ. of Wis-consin Press, 1959, ch. Approximation by functions of fewer variables,pp. 275–327.

[85] E. W. Cheney and W. A. Light,Approximation Theory in TensorProduct Spaces, ser. Lecture Notes in Mathematics. Berlin: Springer-Verlag, 1985, no. 1169.

[86] R. L. Dykstra, “An algorithm for restricted least squares regression,”J. Amer. Stat. Assoc., vol. 78, pp. 837–842, 1983.

[87] J. P. Boyle and R. L. Dykstra,Advances in Order Restricted Inference37, ser. Lecture notes in statistics. Springer, 1985, ch. A methodfor finding projections onto the intersection of convex sets in Hilbertspaces, pp. 28–47.

[88] S.-P. Han, “A successive projection method,”Math. Prog., vol. 40, pp.1–14, 1988.

[89] H. H. Bauschke and J. Borwein, “On projection algorithms for solvingconvex feasibility problems,”SIAM Rev., vol. 38, no. 3, pp. 367–426,1996.

[90] P. L. Combettes and H. J. Trussell, “Method of successive projectionsfor finding a common point of sets in metric spaces,”J. Optim. Theoryand Appl., vol. 67, no. 3, pp. 487–507, 1990.

[91] J. A. Cadzow, “Signal enhancement—a composite property mappingalgorithm,” IEEE Trans. Acoustics, Speech and Signal Processing,vol. 36, no. 1, pp. 49–62, Jan. 1988.

[92] R. Blahut, “Computation of channel capacity and rate distortion func-tions,” IEEE Trans. Inform. Theory, vol. 18, pp. 460–473, 1972.

[93] S. Arimoto, “An algorithm for calculating the capacity of an arbitrary discrete memoryless channel,” IEEE Trans. Inform. Theory, vol. 18, pp. 14–20, 1972.

[94] H. J. Landau and W. L. Miranker, “The recovery of distorted band-limited signals,”J. Math Anal. Appl., vol. 2, pp. 97–104, 1961.

[95] R. W. Gerchberg, “Super-resolution through error energy reduction,”Optica Acta, vol. 21, pp. 709–720, 1974.

[96] D. C. Youla and H. Webb, “Image restoration by the method of convexprojections. I—theory,”IEEE Trans. Medical Imaging, vol. 1, pp. 81–94, 1982.

[97] D. L. Donoho and P. B. Stark, “Uncertainty principles and signalrecovery,”SIAM J. Appl. Math., vol. 49, no. 3, pp. 906–931, Jun. 1989.

[98] A. E. Cetin, O. N. Gerek, and Y. Yardimci, “Equiripple FIR filter design by the FFT algorithm,” IEEE Signal Process. Mag., Mar. 1997.

[99] W. Glunt and T. L. Hayden, “The nearest “doubly stochastic” matrix to a real matrix with the same first moment,” Numer. Linear Algebra Appl., vol. 5, no. 6, Dec. 1998.

[100] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihoodfrom incomplete data via the EM algorithm,”J. Royal Stat. Soc.,vol. 39, pp. 1–38, 1977.