IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 54, NO. 11, NOVEMBER 2006 4405

Tunable Channel Decomposition for MIMO Communications Using Channel State Information

Yi Jiang, Member, IEEE, William W. Hager, and Jian Li, Fellow, IEEE

Abstract—We consider jointly designing transceivers for multiple-input multiple-output (MIMO) communications. Assuming the availability of the channel state information (CSI) at the transmitter (CSIT) and receiver (CSIR), we propose a scheme that can decompose a MIMO channel, in a capacity lossless manner, into multiple subchannels with prescribed capacities, or equivalently, signal-to-interference-and-noise ratios (SINRs). We refer to this scheme as the tunable channel decomposition (TCD), which is based on the recently developed generalized triangular decomposition (GTD) algorithm and the closed-form representation of the minimum mean-squared-error VBLAST (MMSE-VBLAST) equalizer. The TCD scheme is particularly relevant to applications where independent data streams with different qualities-of-service (QoS) share the same MIMO channel. The TCD scheme has two implementation forms. One is the combination of a linear precoder and an MMSE-VBLAST equalizer, which is referred to as TCD-VBLAST, and the other includes a dirty paper (DP) precoder and a linear equalizer followed by a DP decoder, which we refer to as TCD-DP. We also include the optimal code-division multiple-access (CDMA) sequence design as a special case in the framework of MIMO transceiver designs. Hence, our scheme can be directly applied to optimal CDMA sequence design, both in the uplink and downlink scenarios.

Index Terms—Channel capacity, channel decomposition, dirty paper (DP) precoding, generalized triangular decomposition, joint transceiver design, multiple-input multiple-output (MIMO), optimal CDMA sequences, quality-of-service, VBLAST.

I. INTRODUCTION

FOR a multiple-input multiple-output (MIMO) communication system, if the communication environment is slowly time varying, the channel state information (CSI) at the transmitter (CSIT) can be made available via feedback or via the reciprocity principle when time-division duplex (TDD) is used. In the past several years, considerable research has been devoted to the joint transceiver

Manuscript received July 14, 2005; revised December 13, 2005. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Markus Rupp. This work was supported in part by National Science Foundation Grants CCR-0104887, 0203270, and 0620286. This paper was presented in part at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Philadelphia, PA, March 2005.

Y. Jiang was with the Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611-6130 USA. He is now with the University of Colorado, Boulder, CO 80309 USA (e-mail: [email protected]).

W. W. Hager is with the Department of Mathematics, University of Florida, Gainesville, FL 32611-8105 USA (e-mail: [email protected]).

J. Li is with the Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611-6130 USA (e-mail: [email protected]).

Color versions of Figs. 1 and 3–5 are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSP.2006.880233

design utilizing both CSIT and channel state information at the receiver (CSIR) (see, e.g., [1]–[8] and the references therein). The goal of the joint transceiver design is to improve the MIMO system performance, including the channel throughput and bit-error-rate (BER) performance, through optimizing the precoder and equalizer jointly. Two classes of MIMO transceiver designs have been proposed: linear transceiver designs [1]–[3] and nonlinear schemes [4]–[6]. Except for the finite-impulse-response (FIR) MIMO transceiver designs proposed in [7] and [8] for MIMO channels with intersymbol interference (ISI), almost all the linear MIMO transceiver designs start by applying the singular value decomposition (SVD) to the MIMO channel matrix, which leads to multiple orthogonal eigen-subchannels with channel gains determined by the singular values of the channel matrix. We have shown in [4] that if the same constellation is used across all the subchannels, the linear schemes suffer considerable capacity loss, especially when the singular values of the channel are very different. In [5], we proposed a uniform channel decomposition (UCD) scheme, which is based on either a VBLAST equalizer or a dirty paper (DP) precoder, to decompose a MIMO channel into several identical subchannels. The UCD scheme is strictly capacity lossless. That is, the sum capacity of the subchannels of UCD is equal to the capacity of the MIMO channel obtained via SVD with the "water filling" power allocation algorithm. The UCD scheme is an optimal solution among all the existing schemes that aim at improving the communication quality subject to the power constraints. Similar work was recently presented in [9].

In this paper, we tackle a new aspect of the MIMO transceiver design problem. We regard the MIMO transceiver design as a way of decomposing a MIMO channel into multiple subchannels with flexibility. The MIMO channel decomposition through SVD plus "water filling" lacks flexibility despite its optimality in terms of achieving the maximal overall channel capacity. The success of UCD motivates a much more flexible channel decomposition approach, namely the tunable channel decomposition (TCD) scheme, which is the main focus of this paper. Using the recently developed generalized triangular decomposition (GTD) of matrices [10], we propose the TCD scheme to decompose a MIMO channel into multiple subchannels with prescribed capacities, or equivalently, signal-to-interference-and-noise ratios (SINRs). The TCD scheme has two main properties, summarized as follows.

1) Given $K$ parallel subchannels with capacities $\{C_k\}_{k=1}^{K}$, which are obtained through applying SVD plus power allocation to a rank-$K$ MIMO channel, TCD can convert the $K$ subchannels into $L \ge K$ subchannels with capacities $\{R_l\}_{l=1}^{L}$ if and only if $\{C_k\}$ majorizes $\{R_l\}$.1 In particular, $\sum_{l=1}^{L} R_l = \sum_{k=1}^{K} C_k$, i.e., the TCD is capacity lossless.

2) The TCD scheme has two implementation forms. One is the combination of a linear precoder and a minimum mean-squared-error VBLAST (MMSE-VBLAST) equalizer, which is referred to as TCD-VBLAST, and the other includes a DP precoder and a linear equalizer followed by a DP decoder, which we refer to as TCD-DP.

1053-587X/$20.00 © 2006 IEEE

The optimal design of symbol S-CDMA sequences has been under intensive study over the past decade (see, e.g., [11]–[14]). Although MIMO transceiver design and CDMA sequence optimization have been studied in an apparently independent manner in the signal processing and information theory communities, we show that the CDMA sequence design problem can be viewed as a special case of the MIMO transceiver design. Hence, the TCD scheme can be applied to the design of optimal CDMA sequences with little modification. Specifically, the TCD-VBLAST and TCD-DP schemes can be applied to design optimal CDMA sequences in the uplink (mobile-to-base) and downlink (base-to-mobile) scenarios, respectively. Our TCD scheme, which is independently motivated by the MIMO transceiver design problem, turns out to be related to the scheme proposed in [14]. The relationship is discussed in Section V.

The remainder of the paper is organized as follows. Section II introduces the MIMO flat-fading channel model and some preliminary results, including the concept of majorization, the GTD theorem, and the closed-form representation of the MMSE-VBLAST equalizer, which are relevant to the subsequent derivation of the TCD scheme. The TCD scheme is presented in Section III. We discuss in Section IV the application of the TCD scheme to the problem of multi-task MIMO communications with QoS constraints, which was previously studied in [15] and [16]. In Section V, we formulate the problem of optimal CDMA sequence design as a special case of the MIMO transceiver design and show how to apply the TCD scheme to the design of optimal CDMA sequences. Section VI gives the conclusions of this paper. Two long proofs are relegated to the appendices.

II. CHANNEL MODEL AND PRELIMINARIES

A. Channel Model

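We consider a MIMO communication system with $M_t$ transmitting and $M_r$ receiving antennas in a frequency flat-fading channel. The sampled baseband signal is given by

$$\mathbf{y} = \mathbf{H}\mathbf{F}\mathbf{x} + \mathbf{z} \qquad (1)$$

where $\mathbf{x} \in \mathbb{C}^{L \times 1}$ is the information symbol precoded by the linear precoder $\mathbf{F} \in \mathbb{C}^{M_t \times L}$, $\mathbf{y} \in \mathbb{C}^{M_r \times 1}$ is the received signal, and $\mathbf{H} \in \mathbb{C}^{M_r \times M_t}$ is the channel matrix with rank $K$ and with its $(i,j)$th element denoting the fading coefficient between the $j$th transmitting and $i$th receiving antennas. We assume $E[\mathbf{x}\mathbf{x}^*] = \sigma_x^2 \mathbf{I}_L$, where $E[\cdot]$ is the expected value, $(\cdot)^*$ is the conjugate transpose, and $\mathbf{I}_L$ denotes the identity matrix with dimension $L$. We also assume that $\mathbf{z} \sim N(\mathbf{0}, \sigma_z^2 \mathbf{I}_{M_r})$ is the circularly symmetric complex Gaussian noise vector. We define the input SNR as

$$\rho = \frac{E[\mathbf{x}^*\mathbf{F}^*\mathbf{F}\mathbf{x}]}{\sigma_z^2} = \frac{\sigma_x^2}{\sigma_z^2}\,\mathrm{Tr}\{\mathbf{F}^*\mathbf{F}\} \triangleq \frac{1}{\alpha}\,\mathrm{Tr}\{\mathbf{F}^*\mathbf{F}\} \qquad (2)$$

where $\alpha = \sigma_z^2/\sigma_x^2$, and $\mathrm{Tr}\{\cdot\}$ stands for the trace of a matrix.

1 The concept of majorization is introduced in Section II.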

B. Majorization

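We introduce several basic concepts and theorems of the majorization theory from [17].

Definition II.1: For any $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$, if

$$\sum_{i=1}^{k} a_{[i]} \le \sum_{i=1}^{k} b_{[i]}, \quad 1 \le k \le n \qquad (3)$$

with equality for $k = n$, where the subscript $[i]$ denotes the $i$th largest element of the sequence, we say that $\mathbf{a}$ is majorized by $\mathbf{b}$ and denote $\mathbf{a} \prec \mathbf{b}$, or equivalently, $\mathbf{b} \succ \mathbf{a}$.

Definition II.2: An $n \times n$ matrix $\mathbf{D}$ is doubly stochastic if its $(i,j)$th entry $d_{ij} \ge 0$ for $1 \le i, j \le n$, and $\sum_{i=1}^{n} d_{ij} = 1$ and $\sum_{j=1}^{n} d_{ij} = 1$.

Theorem II.1: $\mathbf{a} \prec \mathbf{b}$ if and only if there exists a doubly stochastic matrix $\mathbf{D}$ such that $\mathbf{a} = \mathbf{D}\mathbf{b}$.

A square matrix is said to be a permutation matrix if each row and column has a single one, and all the other entries are zero. There are $n!$ permutation matrices of size $n \times n$.

Theorem II.2: The permutation matrices constitute the extreme points of the set of doubly stochastic matrices. Moreover, the set of doubly stochastic matrices is the convex hull of the permutation matrices.

It follows from Theorems II.1 and II.2 that the set $\{\mathbf{a} : \mathbf{a} \prec \mathbf{b}\}$ is the convex hull spanned by the $n!$ points which are formed from the permutations of elements of $\mathbf{b}$.

Definition II.3: For any $\mathbf{a}, \mathbf{b} \in \mathbb{R}_+^n$, if

$$\prod_{i=1}^{k} a_{[i]} \le \prod_{i=1}^{k} b_{[i]}, \quad 1 \le k \le n \qquad (4)$$

with equality for $k = n$, we say that $\mathbf{a}$ is multiplicatively majorized by $\mathbf{b}$ and write $\mathbf{a} \prec_\times \mathbf{b}$, or equivalently, $\mathbf{b} \succ_\times \mathbf{a}$.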

Obviously, if , then .
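The two majorization definitions above reduce to partial-sum and partial-product tests, which the following Python sketch makes concrete. It is illustrative only: the function names `is_majorized` and `is_mult_majorized` and the tolerance `tol` are assumptions introduced here, and the multiplicative test is restricted to strictly positive vectors so that it can be carried out on logarithms.

```python
import math


def is_majorized(a, b, tol=1e-9):
    """Return True if a is (additively) majorized by b.

    Sort both vectors in nonincreasing order, require every partial sum
    of a to be at most the matching partial sum of b, and require the
    total sums to agree.
    """
    if len(a) != len(b):
        return False
    sa = sorted(a, reverse=True)
    sb = sorted(b, reverse=True)
    pa = pb = 0.0
    for x, y in zip(sa, sb):
        pa += x
        pb += y
        if pa > pb + tol:
            return False
    return abs(pa - pb) <= tol


def is_mult_majorized(a, b, tol=1e-9):
    """Multiplicative majorization for strictly positive vectors.

    For positive entries, the partial-product test is the additive test
    applied to the elementwise logarithms.
    """
    if any(v <= 0 for v in list(a) + list(b)):
        raise ValueError("entries must be strictly positive")
    return is_majorized([math.log(v) for v in a],
                        [math.log(v) for v in b], tol)


if __name__ == "__main__":
    print(is_majorized([2, 2, 2], [3, 2, 1]))   # equalized vector
    print(is_majorized([3, 3, 0], [3, 2, 1]))   # second partial sum too large
    print(is_mult_majorized([2, 2], [4, 1]))    # same product, flatter profile
```

Sorting first is what makes the test independent of the order in which the entries are listed, matching the "$i$th largest element" convention of the definitions.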

C. The GTD Theorem

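Now we are ready to introduce the GTD theorem.

Theorem II.3 (The GTD Theorem): Let $\mathbf{H} \in \mathbb{C}^{m \times n}$ denote a matrix with rank $K$ and singular values $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_K > 0$. There exists an upper triangular matrix $\mathbf{R} \in \mathbb{C}^{K \times K}$ and matrices $\mathbf{Q}$ and $\mathbf{P}$ with orthonormal columns such that $\mathbf{H} = \mathbf{Q}\mathbf{R}\mathbf{P}^*$ if and only if the diagonal elements of $\mathbf{R}$, denoted by $\mathbf{r}$, satisfy $|\mathbf{r}| \prec_\times \boldsymbol{\lambda}$, where $|\cdot|$ denotes the componentwise absolute value.

Proof: See [10].

In [10], we have also presented a computationally efficient and numerically stable algorithm to realize the GTD Theorem II.3. The corresponding Matlab code is also included therein.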
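To make the GTD theorem concrete, the following Python sketch works out the 2 × 2 case for a diagonal matrix with singular values `d1 >= d2 > 0`: it produces rotations `Q` and `P` and an upper triangular `R` whose leading diagonal entry is a prescribed `r` with `d1 >= r >= d2`. The closed-form rotation used here can be verified by direct multiplication; it is not a transcription of the Matlab code in [10], and the names `gtd_2x2`, `matmul`, and `transpose` are hypothetical helpers introduced only for this example.

```python
import math


def matmul(A, B):
    """Product of two small matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def gtd_2x2(d1, d2, r):
    """Return (Q, R, P) with diag(d1, d2) = Q R P^T.

    R is upper triangular with diagonal (r, d1*d2/r); Q and P are
    rotations. Requires d1 >= r >= d2 > 0, which is the multiplicative
    majorization condition of the theorem in the 2 x 2 case.
    """
    if not (d1 >= r >= d2 > 0):
        raise ValueError("need d1 >= r >= d2 > 0")
    if d1 == d2:
        c, s = 1.0, 0.0
    else:
        c = math.sqrt((r * r - d2 * d2) / (d1 * d1 - d2 * d2))
        s = math.sqrt(1.0 - c * c)
    P = [[c, -s], [s, c]]
    Q = [[c * d1 / r, -s * d2 / r], [s * d2 / r, c * d1 / r]]
    R = [[r, s * c * (d2 * d2 - d1 * d1) / r], [0.0, d1 * d2 / r]]
    return Q, R, P


if __name__ == "__main__":
    # Equal diagonal entries (the geometric mean of 4 and 1).
    Q, R, P = gtd_2x2(4.0, 1.0, 2.0)
    print(R)
    print(matmul(matmul(Q, R), transpose(P)))  # should reproduce diag(4, 1)
```

Choosing `r` as the geometric mean of the singular values gives equal diagonal entries, which corresponds to the uniform (UCD-like) special case; any other `r` in the admissible interval trades gain between the two diagonal entries while keeping their product fixed.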


D. Closed-Form Representation of MMSE-VBLAST

The derivation of the TCD scheme for MIMO transceiver design relies on the above GTD theorem as well as the closed-form representation of the nulling vectors of the MMSE-VBLAST detector [18], [19].

Consider the augmented matrix (we remind the reader that)

Applying the QR decomposition yields

(5)

where is an upper triangular matrix with positive diagonal and . We can obtain the MMSE nulling vectors using and as shown in the following lemma [19] (see also the more detailed version [20]).

Lemma II.4: Let denote the columns of andthe diagonal elements of where and

are given in (5). The MMSE nulling vectors are

(6)

It is easy to verify that the output SINR of the th layer (i.e., the signal corresponding to ) obtained via the MMSE-VBLAST detector is

(7)

where .

The following lemma is established in [5].

Lemma II.5: The diagonal of given in (5) and satisfy

(8)

Hence, the capacity of the th layer/subchannel is

(9)
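The link between the QR decomposition of the augmented matrix and the per-layer SINRs can be checked numerically. The sketch below assumes, following [5], that Lemma II.5 takes the form r_ii^2 = alpha * (1 + SINR_i), where r_ii is the ith diagonal entry of the triangular factor and alpha is the noise-to-signal power ratio; the extracted text does not show the formula, so this form, the real-valued example matrix, and the helper names `qr_diag` and `layer_sinrs` are all assumptions made for illustration.

```python
import math


def qr_diag(columns):
    """Diagonal of R in the QR decomposition of a real matrix.

    The matrix is given as a list of columns with full column rank;
    classical Gram-Schmidt is used, which is adequate for small
    well-conditioned examples.
    """
    q_list, diag = [], []
    for col in columns:
        v = list(col)
        for q in q_list:
            proj = sum(qi * ci for qi, ci in zip(q, col))
            v = [vi - proj * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        diag.append(norm)
        q_list.append([vi / norm for vi in v])
    return diag


def layer_sinrs(g_columns, alpha):
    """Per-layer output SINRs from the augmented matrix [G; sqrt(alpha) I]."""
    n = len(g_columns)
    augmented = []
    for j, col in enumerate(g_columns):
        tail = [math.sqrt(alpha) if i == j else 0.0 for i in range(n)]
        augmented.append(list(col) + tail)
    return [r * r / alpha - 1.0 for r in qr_diag(augmented)]


if __name__ == "__main__":
    # Effective channel G = [[2, 1], [0, 1]] stored column by column.
    sinrs = layer_sinrs([[2.0, 0.0], [1.0, 1.0]], alpha=1.0)
    rates = [math.log2(1.0 + s) for s in sinrs]
    print(sinrs)
    print(rates)
```

For this example the values agree with a direct calculation: the last layer, detected first with the other layer as interference, has SINR 1/[(G^T G / alpha + I)^{-1}]_{22} - 1 = 1.2, and the first layer, detected after cancellation, has SINR ||g_1||^2 / alpha = 4.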

III. TUNABLE CHANNEL DECOMPOSITION

A. TCD-VBLAST

We see from (2) that can always be scaled such that . Hence, without loss of generality, we let in the sequel to simplify the notation.

Denote the SVD of the rank channel as , where is a diagonal matrix whose diagonal elements are the nonzero singular values of . The precoder used in the conventional SVD-based linear transceiver designs is , where is a diagonal matrix with the diagonal elements denoting the power allocation. The precoder transforms the MIMO channel into orthogonal subchannels with capacities [21]

b/s/Hz (10)

For this kind of precoder design, the only way of controlling thecapacity of the subchannels is to change the power allocation .

If we modify the precoder to be2

T (11)

where T denotes the transpose, with , and T , then it can be readily seen that introducing does not change the overall channel capacity. However, it brings about much greater flexibility, as demonstrated in the following theorem.

Theorem III.1 (TCD Theorem): Consider the MIMO channel of (1) with given in (11). For any , let be a zero vector except that its first elements are replaced by , where . Given any rates , we can find an orthonormal matrix such that the combination of the linear precoder T and the MMSE-VBLAST detector yields subchannels with capacities if and only if .

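Recall that it follows from Theorems II.1 and II.2 that the set $\{\mathbf{R} : \mathbf{R} \prec \mathbf{C}\}$ is the convex hull spanned by the $L!$ points which are formed from the permutations of the elements of $\mathbf{C}$. We can illustrate Theorem III.1 using the following toy example. Consider a MIMO channel with rank $K = 3$. We assume that the capacities of the three subchannels obtained via SVD are $C_1 \ge C_2 \ge C_3$. If $L = 3$, then TCD can decompose the MIMO channel into three subchannels with a rate vector $\mathbf{R} = [R_1, R_2, R_3]^T$ if and only if $\mathbf{R}$ lies in the convex hull

$$\mathrm{Co}\left\{ [C_{\pi(1)}, C_{\pi(2)}, C_{\pi(3)}]^T : \pi \text{ is a permutation of } \{1, 2, 3\} \right\}. \qquad (12)$$

Here, Co stands for the convex hull defined as

$$\mathrm{Co}\{\mathbf{x}_1, \ldots, \mathbf{x}_n\} = \left\{ \sum_{i=1}^{n} \theta_i \mathbf{x}_i : \theta_i \ge 0, \ \sum_{i=1}^{n} \theta_i = 1 \right\}. \qquad (13)$$

The gray area in Fig. 1 shows the convex hull of (12) with $C_1 = 3$, $C_2 = 2$, and $C_3 = 1$. In this case, the six vertices lie in the two-dimensional (2-D) plane $\{\mathbf{R} : R_1 + R_2 + R_3 = 6\}$. An interesting special case is the UCD scheme [5], which achieves the rate vector corresponding to the center of the convex hull, i.e., $R_1 = R_2 = R_3 = 2$.

2 Allowing to be complex-valued does not introduce additional flexibility, according to the GTD algorithm.

Fig. 1. Illustration of the capacity lossless region obtainable via TCD. We assume $K = 3$, $C_1 = 3$, $C_2 = 2$, and $C_3 = 1$.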
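The toy example can be reproduced with a few lines of Python. The sketch below tests the condition of Theorem III.1 by zero-padding the capacity vector to the number of requested subchannels and applying the partial-sum test of Definition II.1, and it computes the center of the convex hull as the average of all permutations of the capacities. The function names `tcd_feasible` and `hull_center` are hypothetical, and the capacities (3, 2, 1) are the values quoted in the caption of Fig. 1.

```python
from itertools import permutations


def tcd_feasible(rates, capacities, tol=1e-9):
    """Check whether the rate vector is majorized by the zero-padded capacities.

    rates has L entries, capacities has K <= L entries; the capacity
    vector is padded with L - K zeros before the comparison.
    """
    num_rates, num_caps = len(rates), len(capacities)
    if num_rates < num_caps:
        raise ValueError("need at least as many rates as capacities")
    c = sorted(list(capacities) + [0.0] * (num_rates - num_caps), reverse=True)
    r = sorted(rates, reverse=True)
    partial_r = partial_c = 0.0
    for x, y in zip(r, c):
        partial_r += x
        partial_c += y
        if partial_r > partial_c + tol:
            return False
    return abs(partial_r - partial_c) <= tol


def hull_center(capacities):
    """Average of all permutations of the capacity vector."""
    perms = list(permutations(capacities))
    n = len(capacities)
    return [sum(p[i] for p in perms) / len(perms) for i in range(n)]


if __name__ == "__main__":
    caps = [3.0, 2.0, 1.0]
    print(hull_center(caps))                        # uniform rate vector
    print(tcd_feasible([2.0, 2.0, 2.0], caps))      # center of the hull
    print(tcd_feasible([2.5, 2.5, 1.0], caps))      # on the boundary
    print(tcd_feasible([3.5, 1.5, 1.0], caps))      # largest rate exceeds C_1
    print(tcd_feasible([1.5, 1.5, 1.5, 1.5], caps)) # four subchannels from rank 3
```

The last call illustrates the case of more subchannels than the channel rank: the total rate is unchanged, but it is spread over four streams.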

We now prove Theorem III.1.

Proof: Given the precoder of (11), the virtual channel is

T T (14)

where is a diagonal matrix with diagonal elements

(15)

Let the augmented matrix be defined as

T(16)

After some straightforward calculations, we can obtain the SVD of as follows:

... T (17)

where is orthogonal with its first columns forming and the diagonal matrix contains the singular values of

(18)

According to Theorem II.3, we can apply GTD to obtain

T (19)

if and only if the diagonal elements of , which we denote as , satisfy

(20)

Note that both and in (19) are real-valued matrices because is a real-valued diagonal matrix. Inserting (19) into (17) yields

... T T (21)

Choose T and define

... (22)

Then, (21) can be rewritten as , which is the QR decomposition of . By Lemma II.5, it follows that for , (20) is equivalent to

(23)

where , denotes the output SINR of the th subchannel, and is given in (18). If

(24)

where with entries , then (20) and (23) hold, which implies the existence of (the first columns of T).

Conversely, suppose that there exists a matrix with orthogonal columns such that the linear precoder T and the MMSE-VBLAST detector yield subchannels with capacities . Let be the QR decomposition. Since QR decomposition is a special case of GTD, it follows from Theorem II.3 that (20) and consequently (23) hold. Hence, (24) holds since implies .

In the previous derivations of the capacity performance of TCD, we ignored the error propagation effect which is inherent in the VBLAST equalizer. This simplified treatment can be justified using Shannon's theory. According to Shannon's theory, when the data rate is less than the channel capacity, the decoding error probability can be made arbitrarily close to zero if the coding length is sufficiently long. Consequently, if a powerful codec is employed at each layer, then the error probability of each layer can be arbitrarily low and hence error propagation can be made negligible. Moreover, the third numerical example in Section IV-B suggests that with some proper detection ordering, the performance degradation due to error propagation may be slight even for uncoded systems.

TABLE I: TCD-VBLAST SCHEME

The proof of Theorem III.1 suggests that given the SVD of and the power loading level , we only need to calculate , and the GTD of given in (19). Then, we immediately obtain the linear precoder

T ... (25)

Let denote the first rows of . Then it follows from (22) that

...

... (26)

where is diagonal with its th diagonal element being . According to Lemma II.4, the nulling vectors are calculated as

(27)

where is the th diagonal element of and is the th column of .

Given the SVD of and the power allocation level , the TCD-VBLAST scheme needs to run the procedures summarized in Table I. If , then the TCD-VBLAST scheme requires only flops, given the SVD of the channel matrix. Readers are referred to [5] for the verification of the flop count.

B. TCD-DP

As a dual form of TCD-VBLAST, the TCD scheme can be implemented by using a DP precoder, which we refer to as TCD-DP. We exploit the uplink–downlink duality revealed in [22] to obtain TCD-DP. The derivation of TCD-DP is almost the same as that of UCD-DP in [5]. The only difference is that (54) in [5] is replaced by (recall that here)

(28)

Just like UCD, for a system with a large number of substreams, TCD-DP is a better choice than TCD-VBLAST since it is free of propagation errors. Readers are referred to [5, Sec. IV-B] for the detailed derivations.

IV. MIMO COMMUNICATIONS WITH QOS CONSTRAINTS

We apply the TCD scheme to MIMO communications with prescribed QoS constraints. Suppose we wish to transmit independent data streams through a MIMO channel. Instead of multiplexing all the substreams in the time division manner to share the entire MIMO channel, we apply TCD to decompose the MIMO channel into multiple subchannels whose capacities/SINRs meet the QoS requirements of the substreams, and dedicate one subchannel to each substream. In [16], the authors studied the same problem. They proposed a linear transceiver design which, similar to TCD, can also control the SINR of each subchannel via designing the precoder. However, the linear transceiver is capacity lossy and can suffer from considerable performance degradation compared with our TCD scheme, as we will show at the end of this section.

A. Optimal Power Allocation

Given that all the subchannels meet the QoS constraints, we need to minimize the overall input power, i.e., solve the following optimization problem:

subject to

(29)

Here, denotes the QR decomposition and denotes the vector formed by the diagonal of . According to Lemma II.5, the diagonal of determines the SINRs of the subchannels. Without loss of generality, we assume that . We now consider a problem with more relaxed constraints than those of (29), as follows:

subject to

(30)

where stands for the singular values of the augmented matrix . In general, for any matrix , we let denote the singular values of . By Theorem II.3, if is feasible in (29), then is feasible in (30). We now further simplify (30) and show that its solution provides a solution to (29).

Theorem IV.1: If is the singular value decomposition of , then (30) has a solution having the form where is a diagonal matrix with diagonal elements , chosen to solve the problem (see (31)). Moreover, if T is the GTD of in (19), then (29) has the solution T where is a solution of (31) and is the matrix formed by the first columns of T.

Proof: See Appendix A.

We now develop an efficient algorithm for solving (31). We will see that the constraint can be omitted since it is automatically satisfied at a minimizer of (31). To begin, we make a change of variables to further simplify the formulation of (31). We define

(32)

With these definitions, (31) reduces to (33). Both the equality constraint and the inequalities in (31) have been dropped since these constraints are automatically satisfied at an optimum. The fact that is established after Lemma IV.2.

We now analyze the structure of the minimizer. By exploiting the structure, we obtain a fast algorithm for solving (33). We first study a similar optimization problem with relaxed constraints.

Lemma IV.1: Any solution of the problem

subject to

(34)

has the property that for each .

Proof: We replace the inequalities in (34) by the equivalent constraints obtained by taking logarithms, as follows:

The Lagrangian associated with (34), after this modification of the constraints, is

By the first-order optimality conditions associated with , there exists with the property that the gradient of the Lagrangian with respect to vanishes. Equating to zero the partial derivative of the Lagrangian with respect to , we obtain the relations

Thus

Hence, .

Using Lemma IV.1, we can gain insights into the structure of a solution to (33).

Lemma IV.2: There exists a solution to (33) with the property that for some integer

for all for all

for all (35)

In particular, for all .

Proof: See Appendix B.

By Lemma IV.2, is a decreasing function of for while for . Since is a decreasing function of , it follows that is a decreasing function of for with , while for . Hence, is a decreasing function of . In particular, the constraint in (31) is automatically satisfied by the associated solution characterized in Lemma IV.2.

(31)

(33)


Fig. 2. Matlab function to solve (33).

We refer to the index in Lemma IV.2 as the "break point." At the break point, the lower bound constraint changes from inactive to active. We now use Lemma IV.2 to obtain an algorithm for (33).

Lemma IV.3: Let denote the th geometric mean of the

and let denote an index for which is the largest:

(36)

If , then putting for all is optimal in (33). If , then for all at an optimal solution of (33).

Proof: See Appendix C.

Based on Lemma IV.3, we can use the following strategy to solve (33). We form the geometric mean described in Lemma IV.3 and we evaluate . If , then we set for , and we simplify (33) by removing , from the problem. If , then we set for , and we simplify (33) by removing , from the problem. The Matlab code implementing this algorithm appears in Fig. 2.

After obtaining the power loading level, we calculate the precoder and the nulling vectors according to Table I in Section III.

Fig. 3. Input SNR versus prescribed output SINR. The result is based on the average of 500 Monte Carlo trials of an i.i.d. Rayleigh flat-fading channel with M = 5 and M = 6.

Note that one of the possible paths through the routine makes the leading elements of all equal while setting the trailing elements to be . This path coincides with the standard water filling algorithm. In this case, the TCD scheme is optimal in terms of maximizing the overall throughput given the input power. On the other hand, if some substream has a very high prescribed SINR such that the given in (MAX) is less than the "break point" , then leads to a multilevel water filling power allocation, which suffers from overall capacity loss. This happens when the target rate vector falls out of the convex hull spanned by the permutations of (cf. Fig. 1), where , are the capacities of the eigen-subchannels with standard water filling power allocation. When it happens, the power allocation algorithm applies multilevel water filling to "expand the convex hull" such that the target rate vector lies in it. As an alternative solution to this issue, one can "break" (if it is practically allowable) the oversized substream into more than one substream with smaller rates, or equivalently, lower SINR requirements. Note that TCD can decompose a MIMO channel into an arbitrarily large number of subchannels to accommodate the remedy.
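For reference, the standard (single-level) water filling allocation mentioned above can be sketched as follows. This is the textbook procedure, not the multilevel routine of Fig. 2; the function name `water_filling`, its arguments, and the convention that `gains` holds squared singular values while `alpha` is the noise-to-signal power ratio are assumptions made for this illustration.

```python
def water_filling(gains, total_power, alpha=1.0):
    """Standard water filling over parallel subchannels.

    gains: positive squared singular values of the channel.
    Maximizes sum_k log2(1 + p_k * gains[k] / alpha) subject to
    sum_k p_k = total_power and p_k >= 0. Returns the powers in the
    same order as gains.
    """
    order = sorted(range(len(gains)), key=lambda k: gains[k], reverse=True)
    floor = [alpha / gains[k] for k in order]  # inverse gains, best channel first
    active = len(floor)
    level = 0.0
    while active > 0:
        level = (total_power + sum(floor[:active])) / active
        if level > floor[active - 1]:
            break  # weakest active subchannel still receives positive power
        active -= 1  # otherwise switch it off and recompute the water level
    powers = [0.0] * len(gains)
    for idx in range(active):
        powers[order[idx]] = level - floor[idx]
    return powers


if __name__ == "__main__":
    print(water_filling([4.0, 1.0], total_power=1.0))  # both subchannels active
    print(water_filling([4.0, 1.0], total_power=0.5))  # weak subchannel switched off
```

With a small power budget the weak subchannel is left empty, which is the single-level behavior that the multilevel allocation described above departs from when a substream has a very high prescribed SINR.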

An interesting special case is when , i.e., the substreams share the same SINR requirements. In this case, since the singular values are in nonincreasing order, and yields a standard water filling solution. In this case, as expected, TCD becomes UCD.

B. Numerical Examples

We present three numerical examples to conclude this section. In the first example, we assume Rayleigh independent flat-fading channels with and . We consider equal QoS requirements for independent substreams. Fig. 3 compares the input power needed by our TCD scheme and the linear transceiver scheme of [16]. Our scheme can save about 2.5 dB for any prescribed output SINR.


TABLE II: COMPARISON BETWEEN TCD-VBLAST AND LINEAR TRANSCEIVER SCHEME

Fig. 4. Input SNR versus $C_1$. A rank 2 channel is decomposed into two subchannels with capacities $C_1$ and $C_2 = 10 - C_1$.

In the second example, we consider a rank two MIMO channel with singular values . Suppose we want to decompose the MIMO channel into two subchannels with capacities and with b/s/Hz. We consider the three scenarios with , and . As shown in Fig. 4, for all three cases, there is an inflection point beyond which our TCD is the same as the linear design of [16]. That is because when the two subchannels have very disparate QoS constraints, i.e., is far larger than , the optimal strategy is to apply SVD to the channel matrix and transmit data through the orthogonal eigen-subchannels. (In this case, (cf. (11)).) If the subchannel QoS constraints are not too disparate, which corresponds to the region to the left of the inflection point, the required input power of our TCD scheme is invariant with respect to and is strictly less than that needed by the linear design. This region corresponds to the capacity lossless region (cf. Fig. 1). Another interesting point is that the relative advantage of TCD is more prominent if the singular values become more disparate.

In the third example, we consider transmitting three data streams over a 3 × 3 Rayleigh independent flat-fading channel. The three data streams have QoS constraints 12, 17, and 18 dB, respectively. The data stream with 12-dB output SINR uses quadrature phase-shift keying (QPSK) modulation, and the other two use 16-QAM. Table II compares the symbol error rates (SER) of the three data streams using our TCD-VBLAST scheme and the linear scheme proposed in [16]. The results are based on 100 independent channel realizations. We found that under the same QoS constraints, TCD-VBLAST has slightly higher SERs than the linear scheme, which is mainly due to the error propagation effect of the VBLAST equalizer. However, the TCD scheme can save about 3.7 dB of average input power compared to the linear scheme. Moreover, Fig. 5 shows that over the different channel realizations, the input power of TCD has a much smaller dynamic range than the linear transceiver. Note that a smaller input power variation is advantageous in circuit implementation. It is worth noting that when using the TCD-VBLAST scheme, one needs to detect the data streams with the smallest SER first to minimize the error propagation towards the subsequent layers. In this example, we first detect the stream using QPSK, and then the one with output SINR 18 dB. The stream with output SINR 17 dB is detected last. Hence, compared with the linear scheme, our TCD scheme has an extra constraint on detection ordering. Such a constraint, however, is usually harmless.

Fig. 5. Input power required by TCD and linear transceiver to meet the QoS constraints. 100 Monte Carlo trials.
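The decibel figures quoted in this example can be converted to linear quantities with a short calculation. The sketch below turns each prescribed output SINR into the corresponding subchannel capacity log2(1 + SINR) and turns the reported 3.7-dB saving into a power ratio; the helper names `db_to_linear` and `rate_for_sinr_db` are hypothetical and serve only this illustration.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from decibels to linear scale."""
    return 10.0 ** (value_db / 10.0)


def rate_for_sinr_db(sinr_db):
    """Capacity in b/s/Hz of a subchannel with the given output SINR in dB."""
    return math.log2(1.0 + db_to_linear(sinr_db))


if __name__ == "__main__":
    for sinr_db in (12, 17, 18):
        print(sinr_db, "dB ->", round(rate_for_sinr_db(sinr_db), 2), "b/s/Hz")
    # Linear power ratio corresponding to a 3.7-dB saving.
    print(round(db_to_linear(3.7), 2))
```

A saving of a few decibels therefore corresponds to the linear scheme needing more than twice the input power of TCD on average in this example.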

The remarkable performance degradation of the linear scheme can be explained in two aspects. First, the linear scheme consists of a linear MMSE (LMMSE) equalizer. It is well known that the LMMSE equalizer is in general suboptimal because it ignores the correlation between the linear estimation errors of the substreams. In other words, it does not take into account the nondiagonality of the error matrix (cf. [16, eq. (10)]). The second, and probably the more important, reason is that the linear scheme tends to put more power in the weak eigen-subchannels. An illustrative example is as follows. For a typical realization of a 3 × 3 Rayleigh channel with singular values T, the power allocation algorithm of the linear scheme [16, Algorithm 3] yields T. That is, it puts most power in the weakest eigen-subchannel associated with the smallest singular value , which is obviously not efficient. (However, such power allocation is optimal in the framework of linear transceiver designs.) In contrast, the TCD scheme employs power loading T, which, similar to the standard water filling, puts more power to the good channels and less to the bad ones.

V. CDMA SEQUENCE DESIGN

We recognize that the problem of designing optimal S-CDMA sequences, which also has been under intensive research in the past several years, can be regarded as a special case of the MIMO transceiver design. In an idealized S-CDMA system where the channel does not experience any fading or near–far effect, mobile users modulate their information symbols via spreading sequences , each of which has the processing gain . The discrete-time baseband S-CDMA signal received at the (single-antenna) basestation can be represented as [11]

(37)

where and the th entry of , stands for the information symbol from the th user. In the downlink channel, the basestation multiplexes the information dedicated to the mobile users through the spreading sequences, which are the columns of . Then, all the mobiles receive the same signal given in (37). We remark that (37) can also be written as (1) with and . Here is the processing gain. Hence, optimizing the spreading sequences amounts to optimizing the precoder for a MIMO system. Indeed, due to the simple channel matrix , some procedures of the TCD scheme can be simplified. We shall show that the TCD scheme turns out to be an improved solution to the sequence design proposed in [14]. At the end of this section, we will compare our TCD scheme and the scheme proposed in [14].

A. CDMA Sequences Maximizing Sum Capacity

Recall that the precoder maximizing the overall MIMO channel capacity is T where is obtained by the water filling algorithm. For an S-CDMA channel, , then and the optimal power loading level is the uniform power allocation. Hence, the CDMA sequence maximizing the sum capacity is T. Since has orthonormal columns, we obtain T . This observation coincides with the findings in [11], in which the authors show that the CDMA sequences maximizing the sum capacity are the Welch-Bound-Equality sequences.
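As a small illustration of Welch-Bound-Equality (WBE) sequences, the sketch below builds three unit-norm real sequences of length two, placed 120 degrees apart, and checks the two properties usually associated with WBE sets: the sequence matrix has orthogonal rows of equal norm, and the total squared correlation equals the Welch lower bound (number of users squared divided by the processing gain). The dimensions, the construction, and the helper names are assumptions chosen for illustration; they do not come from the numerical example in this paper.

```python
import math


def gram_rows(S):
    """Return S S^T for a matrix S stored as a list of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in S] for r1 in S]


def total_squared_correlation(S):
    """Sum of squared inner products over all ordered pairs of columns."""
    cols = list(zip(*S))
    return sum(sum(a * b for a, b in zip(u, v)) ** 2
               for u in cols for v in cols)


if __name__ == "__main__":
    num_users, gain = 3, 2  # hypothetical sizes: three users, processing gain two
    angles = [math.pi / 2 + 2 * math.pi * k / num_users
              for k in range(num_users)]
    # Each column (cos t, sin t) is one unit-norm spreading sequence.
    S = [[math.cos(t) for t in angles], [math.sin(t) for t in angles]]
    print(gram_rows(S))                  # close to (num_users / gain) * identity
    print(total_squared_correlation(S))  # close to num_users**2 / gain
```

With more users than chips the sequences cannot be mutually orthogonal, so the WBE property spreads the unavoidable cross-correlation evenly, which is consistent with the uniform power loading described above.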

B. Uplink Case

For the uplink scenario, i.e., the mobiles to basestation case, the basestation calculates the optimal CDMA sequence for each mobile user and the corresponding successive nulling vector needed by itself. Then the basestation informs the mobile users of their designated CDMA sequences.

First, we need to calculate the power loading levels such that the following GTD matrix decomposition is possible:

... T (38)

where the diagonal elements of , satisfy the QoS constraints. (Recall the relationship .) Note that the singular values of form a sequence whose first elements are , followed by ones. From Theorem II.3, (38) exists if and only if

(39)

Similar to (31), we need to solve the problem

(40)

Similar to (33), (40) can be further simplified using the variables

for

and

The simplified problem is

(41)

The algorithm in Fig. 2 is simplified immensely when we apply it to (41). Since for all , the constraints are inactive. Since for all , the geometric means satisfy for all . Hence, in Lemma IV.3, the value of is either 1 or . If , then we set and we remove from the problem. If , then for all . It follows that there exists an index with the property that

for all

and

for all

This observation coincides with the solution obtained in [14].

Let denote an identity matrix with its first diagonal elements replaced by . According to the TCD scheme presented in Section III-A, we apply the GTD algorithm to to obtain

T (42)

According to (25)

... (43)

Let

... (44)

By (26) and (27), the nulling vectors used at the basestation are

(45)

where is the th diagonal element of . In summary, the basestation needs to run the following three steps:

1) solve the optimization problem (41);
2) apply the GTD algorithm to in (42);
3) obtain the spreading sequences for all mobile users, , and the nulling vectors (cf. (43) and (45)) for the basestation.

C. Downlink Case

In the downlink case, the mobiles cannot cooperate with each other for decision feedback. Hence, the VBLAST detection is impractical at the receivers. However, we can apply TCD-DP as introduced in Section III-B to cancel out known interferences at the transmitter, i.e., the basestation. We can convert the downlink problem into an uplink one and exploit the downlink–uplink duality as we have done in Section III-B. Note that , i.e., the downlink and uplink channels are the same. Consider the case where the uplink and downlink communications are symmetric, i.e., for each mobile user, the QoS of the communications from the user to the basestation and from the basestation to the user are the same. After obtaining the spreading sequences for the mobile users, and the nulling vectors used at the basestation for the uplink case, we immediately know that, in the downlink case, the spreading sequences transmitted from the basestation are exactly and the nulling vectors used at the mobiles are the spreading sequences, , used in the uplink case. The only parameters we need to calculate are (cf. (28)). Hence, in this symmetric case, the basestation only needs to inform the mobiles of their designated spreading sequences once in the two-way communications. Each mobile uses the same sequence for both data transmission in the uplink channel and interference nulling in the downlink channel.

D. Numerical Example

We present one numerical example to show how TCD can be applied to CDMA sequence design in both the uplink and downlink cases. We consider an example where there are four mobile users and the processing gain . The prescribed SINRs of the four users are , and 17 dB, respectively. For the uplink case, we apply the TCD-VBLAST scheme to obtain the spreading sequences of the four users as the columns of the matrix

(46)

The nulling vectors used by the basestation are the columns of the matrix

(47)

We note that for this uplink scenario, the basestation detects the fourth mobile user, which has the spreading sequence corresponding to the fourth column of , first and the first user last.

If the prescribed SINRs of the four users remain the same in the downlink scenario, the spreading sequences used by the basestation are the four columns of the matrix

(48)

In this case, the basestation applies the DP precoder to the first mobile user first and the last user last. Note that the columns of and in (47) are the same up to four scaling factors, which are obtained using (28). Moreover, T T , which means that the power consumed at the basestation equals the overall power used by the four mobile users. At the mobile end, the users use the nulling vectors

(49)

which are exactly the spreading sequences used in the uplink scenario. Scaling the output signals may be necessary at the mobile ends for the subsequent DP decoder. But the signal scaling does not influence the output SINR.
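The scale invariance of the output SINR is easy to verify numerically. The following minimal sketch assumes unit-power symbols, white noise, and a linear combiner; the vectors, the noise variance, and the helper name `output_sinr` are hypothetical values chosen for illustration and are not taken from the example above.

```python
import numpy as np

def output_sinr(w, s_desired, interferers, noise_var):
    """SINR at the output of the linear combiner w (unit-power symbols assumed)."""
    signal = float(np.dot(w, s_desired)) ** 2
    interference = sum(float(np.dot(w, s)) ** 2 for s in interferers)
    noise = noise_var * float(np.dot(w, w))
    return signal / (interference + noise)

# Hypothetical nulling vector, desired sequence, and residual interferers.
w = np.array([0.5, -1.0, 2.0])
s1 = np.array([1.0, 0.0, 0.5])
others = [np.array([0.2, 0.9, -0.1]), np.array([-0.3, 0.4, 0.8])]

base = output_sinr(w, s1, others, noise_var=0.1)
scaled = output_sinr(-3.7 * w, s1, others, noise_var=0.1)
print(base, scaled)
```

Scaling the combiner by any nonzero constant multiplies the signal, interference, and noise powers by the same factor, so the ratio is unchanged.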

If zeros are not allowed in the spreading sequences, we can left multiply and by a 3 × 3 orthogonal matrix to eliminate the zeros in .
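The effect of such a rotation can be sketched as follows, assuming that the two matrices being rotated are the spreading-sequence matrix and the nulling-vector matrix. The matrices `S` and `W` below are hypothetical stand-ins, not the matrices in (46) and (47), and the Householder reflection is just one convenient orthogonal matrix without zero entries.

```python
import numpy as np

# Hypothetical 3 x 4 spreading matrix containing zeros (unit-norm columns).
S = np.array([[1.0, 0.0, 0.6, 0.0],
              [0.0, 1.0, 0.8, 0.6],
              [0.0, 0.0, 0.0, 0.8]])

# Hypothetical 3 x 4 matrix of nulling vectors.
rng = np.random.default_rng(3)
W = rng.standard_normal((3, 4))

# Householder reflection: a 3 x 3 orthogonal matrix with no zero entries.
v = np.ones((3, 1)) / np.sqrt(3.0)
Q = np.eye(3) - 2.0 * (v @ v.T)

S_rot = Q @ S
W_rot = Q @ W

print(np.round(S_rot, 3))
```

Since Q is orthogonal, all inner products between nulling vectors and spreading sequences, as well as the norms of the nulling vectors, are preserved, so under white noise the output SINRs are unaffected by the rotation.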

E. Further Remarks

The TCD scheme, which was originally motivated by MIMO transceiver designs, turns out to be similar to the scheme of [14] in several aspects. Both schemes are based on the nonlinear decision feedback operations. Hence, both are optimal in terms of maximizing the channel throughput and minimizing the overall input power. Both the GTD algorithm, on which the TCD scheme is based, and the construction of the Hermitian matrix with prescribed eigenvalues and Cholesky values as done in [14] rely on the Weyl–Horn theorem. However, our TCD scheme enjoys several remarkable advantages over the scheme of [14]. First, note that if we obtain the GTD , where has the prescribed diagonal elements, then it follows immediately that is the desired Cholesky decomposition. However, the information associated with is lost in the Cholesky decomposition. Hence, the nulling vectors used at the receivers of [14] cannot be calculated explicitly as our TCD does (cf. (27)). Furthermore, the correlation matrix is only an intermediate result. To get the CDMA sequences, one has to decompose explicitly. The TCD scheme, however, can be used to obtain both the precoder (CDMA sequences), which are the columns of , and the equalizer from simultaneously. Second, our TCD scheme is a solution to the more general MIMO transceiver design problem. The Cholesky decomposition algorithm provided in Appendix C of [14] is only applicable to the scenario where the singular values take only two distinct values. Hence, it is not applicable to the general design of MIMO transceivers. The more general Cholesky factorization algorithm suggested in the proofs of [14] is computationally far less efficient. Third, the TCD scheme has two implementation forms, i.e., TCD-VBLAST and TCD-DP, which makes it applicable to both the downlink and uplink scenarios. Finally, the TCD scheme provides insight by identifying the CDMA sequence design problem as a special case of the MIMO transceiver design problem.
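The first point, that a triangular factorization of the matrix itself yields the Cholesky factor of its Gram matrix while the orthonormal factor is lost, can be illustrated with an ordinary QR decomposition standing in for the GTD. This is only a sketch under that substitution: the GTD additionally involves a second orthonormal factor and a prescribed diagonal, and the matrix `H` below is random rather than one of the matrices in this section.

```python
import numpy as np

rng = np.random.default_rng(1)
H = rng.standard_normal((5, 3))  # hypothetical tall matrix

# QR factorization H = Q R; flip signs so that diag(R) is positive.
Q, R = np.linalg.qr(H)
signs = np.sign(np.diag(R))
Q = Q * signs            # scale the columns of Q
R = signs[:, None] * R   # scale the rows of R

# Cholesky factor of the Gram matrix: H^T H = L L^T with L lower triangular.
L = np.linalg.cholesky(H.T @ H)

# A different orthonormal factor paired with the same R gives the same Gram matrix.
Q2, _ = np.linalg.qr(rng.standard_normal((5, 3)))
H2 = Q2 @ R

print(np.allclose(L.T, R), np.allclose(H2.T @ H2, H.T @ H))
```

The Gram matrix depends only on the triangular factor, so two matrices with different orthonormal factors are indistinguishable from their Cholesky decompositions. This is why a method that works from the Cholesky factor alone cannot return the orthonormal factor directly.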

VI. CONCLUSION

Based on the recently developed GTD matrix decomposition algorithm, we have proposed the TCD scheme utilizing the CSIT and CSIR. TCD can be used to decompose a MIMO channel into multiple subchannels with prescribed capacities. The TCD scheme has two implementation forms. One is the combination of a linear precoder and an MMSE-VBLAST detector, which is referred to as TCD-VBLAST, and the other includes a DP precoder and a linear equalizer followed by a DP decoder, which we refer to as TCD-DP. We have also determined the subchannel capacity region such that a capacity lossless decomposition is possible. The applications of the TCD scheme for MIMO communications with QoS constraints have been investigated. We have also identified the problem of designing CDMA sequences as a special case in the unifying framework of MIMO transceiver designs. In particular, we have shown that the CDMA sequence design problem in the uplink and downlink scenarios can be solved using TCD-VBLAST and TCD-DP, respectively.

APPENDIX A
PROOF OF THEOREM IV.1

Observe that for , we have

(50)

Hence, for , and

Since , the last inequalities in the multiplicative majorization condition in (30) are implied by the single equality constraint in (31). Hence, the problem (30) reduces to (31) where , which gives an upper bound for the minimum in (30).

Let T denote the singular value decomposition for any given . Once again, is given by the sum in (50). By [23, Theorem 3.3.14], the singular values of the product of two matrices are multiplicatively majorized by the product of the singular values of and

(51)

Taking logarithms, we have

(52)
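The majorization property quoted from [23, Theorem 3.3.14] can be checked numerically in its logarithmic form. The sketch below uses random square matrices as hypothetical inputs; it illustrates the general inequality, not the specific matrices of this proof.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))  # hypothetical factors
B = rng.standard_normal((4, 4))

# numpy returns singular values sorted in decreasing order.
s_ab = np.linalg.svd(A @ B, compute_uv=False)
s_a = np.linalg.svd(A, compute_uv=False)
s_b = np.linalg.svd(B, compute_uv=False)

# Partial sums of logs correspond to partial products of singular values.
lhs = np.cumsum(np.log(s_ab))
rhs = np.cumsum(np.log(s_a) + np.log(s_b))

print(np.all(lhs <= rhs + 1e-8))
```

For square factors the final partial sums coincide, because the product of all singular values equals the absolute value of the determinant, which is multiplicative.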

By [23, Lemma 3.3.8] and (52)

(53)

whenever is a real-valued, increasing convex function. The function is convex since its second derivative

is positive. Making this choice for in (53) and exponentiating both sides, we obtain

(54)

Since is feasible in (30)

Combining this with (54), we get

(55)

Since is the square of the th singular value of the augmented matrix corresponding to the choice , we conclude that satisfies all the inequality constraints in (30). If the inequality (55) is strict, then should be decreased in order to satisfy the equality constraint in (31). Since decreasing only lowers , we deduce that the minimum in (30) is achieved by a matrix of the form . If is optimal in (30), then so is T whenever has orthonormal columns (since the constraints are satisfied and the value of the cost does not change). We now make the choice for given in Theorem III.1. That is, if T is the GTD of in (19) where is a solution of (31), then is the matrix formed by the first columns of T. For this choice of , the constraints of (29) are satisfied. As noted earlier, the minimum in (29) can be no smaller than the minimum in (30). Since this choice for yields the same cost in both (29) and (30), we conclude that T is optimal in (29).

APPENDIX B
PROOF OF LEMMA IV.2

If is a solution to (33) with the property that for all , then by the convexity of the constraints, it follows that is a solution of (34). By Lemma IV.1, we conclude that Lemma IV.2 holds with . Now, suppose that is a solution of (33) with for some . We wish to show that for all . Suppose, to the contrary, that there exists an index with the property that and . We show that components and of can be modified so as to satisfy the constraints and make the cost strictly smaller. In particular, let be identical with except for components and

and (56)

For a small satisfies the constraints of (33). The change in the cost function of (33) is

The derivative of evaluated at zero is

Since is an increasing function of and since , we conclude that and . Hence, for near zero, has a smaller cost than , which yields a contradiction. Hence, there exists an index with the property that for all and for all .

According to Lemma IV.1, for any . To complete the proof, we need to show that . As noted previously, any solution of (33) satisfies

which implies (cf. (32))

That is, the constraint in (33) is inactive. If , we will decrease the th component and increase the component, while leaving the other components unchanged. Letting be the modified vector, we set

and

Since the th constraint in (33) is inactive, is feasible for near zero. And if , then the cost decreases as increases. It follows that .

APPENDIX C
PROOF OF LEMMA IV.3

First suppose that . By the arithmetic/geometric mean inequality, the problem

subject to (57)


has the solution . Since is a decreasing function of and , we conclude that satisfies the constraints for . Since attains the maximum in (36)

for all . Hence, by taking for , the first inequalities in (33) are satisfied, with equality for , and the first lower bound constraints are satisfied.

Let denote any optimal solution of (33). If

(58)

then by the unique optimality of , in (57), and by the fact that the inequality constraints in (33) are satisfied for , we conclude that for all . On the other hand, suppose that

(59)

We show below that this leads to a contradiction; consequently, (58) holds and for .

Define the quantity

By (59), . Again, by the arithmetic/geometric mean inequality, the solution of the problem

subject to (60)

is for . By (59), and satisfies the inequality constraints in (33) for .

Let be the smallest index with the property that

(61)

Such an index exists since is optimal, which implies that

First, suppose that , where is the break point given in Lemma IV.2. By complementary slackness, and

for . We conclude that for . By (61), we have

It follows that

which contradicts the fact that achieves the maximum in (36). In the case , we have for . Again, this follows by complementary slackness. However, we need to stop when since the lower bound constraints become active for . In Lemma IV.2, we show that for . Consequently, we have

Again, this contradicts the fact that achieves the maximum in (36). This completes the analysis of the case where .

Now consider the case . By the definition of , we have

or (62)

If is the break point described in Lemma IV.2, then for all ; it follows that

(63)

Since the product of the components of is equal to the product of the components of , from (62) and (63) we get

Hence, for all . In particular, if , then , or, when . As a consequence, .

ACKNOWLEDGMENT

The authors are grateful to the anonymous reviewers for their helpful suggestions for improving the submitted manuscript.

REFERENCES

[1] G. G. Raleigh and J. M. Cioffi, “Spatial-temporal coding for wireless communication,” IEEE Trans. Commun., vol. 46, pp. 357–366, Mar. 1998.

[2] A. Scaglione, G. B. Giannakis, and S. Barbarossa, “Filterbank transceiver optimizing information rate in block transmissions over dispersive channels,” IEEE Trans. Inf. Theory, vol. 45, pp. 1019–1032, Apr. 1999.

[3] D. Palomar, J. Cioffi, and M. Lagunas, “Joint Tx–Rx beamforming design for multicarrier MIMO channels: A unified framework for convex optimization,” IEEE Trans. Signal Process., vol. 51, no. 9, pp. 2381–2401, Sep. 2003.

[4] Y. Jiang, J. Li, and W. Hager, “Joint transceiver design for MIMO communications using geometric mean decomposition,” IEEE Trans. Signal Process., vol. 53, no. 10, pp. 3791–3803, Oct. 2005.

[5] Y. Jiang, J. Li, and W. Hager, “Uniform channel decomposition for MIMO communications,” IEEE Trans. Signal Process., vol. 53, no. 11, pp. 4283–4294, Nov. 2005.

[6] J.-K. Zhang, A. Kavcic, and K. M. Wong, “Equal-diagonal QR decomposition and its application to precoder design for successive-cancellation detection,” IEEE Trans. Inf. Theory, vol. 51, pp. 154–172, Jan. 2005.

[7] S. Kung, Y. Wu, and X. Zhang, “Bezout space–time precoders and equalizers for MIMO channels,” IEEE Trans. Signal Process., vol. 50, no. 10, pp. 2499–2514, Oct. 2002.

[8] A. Hjørungnes, P. Diniz, and M. de Campos, “Jointly minimum BER transmitter and receiver FIR MIMO filters for binary signal vectors,” IEEE Trans. Signal Process., vol. 52, no. 4, pp. 1021–1036, Apr. 2004.

[9] F. Xu, T. Davidson, J. K. Zhang, and S. Chan, “Design of block transceivers with MMSE decision feedback detection,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing, Mar. 2005, vol. 3, pp. iii/1109–iii/1112.

[10] Y. Jiang, W. Hager, and J. Li, “The generalized triangular decomposition,” [Online]. Available: http://www.sal.ufl.edu/yjiang/papers/gtd.pdf, submitted for publication.

[11] M. Rupf and J. L. Massey, “Optimal sequence multisets for synchronous code-division multiple-access channels,” IEEE Trans. Inf. Theory, vol. 40, pp. 1261–1266, Jul. 1994.

[12] P. Viswanath and V. Anantharam, “Optimal sequences and sum capacity of synchronous CDMA systems,” IEEE Trans. Inf. Theory, vol. 45, pp. 1984–1991, Sep. 1999.

[13] P. Viswanath, V. Anantharam, and D. Tse, “Optimal sequences, power control and user capacity of synchronous CDMA systems with linear MMSE multiuser receivers,” IEEE Trans. Inf. Theory, vol. 45, pp. 1968–1983, Sep. 1999.

[14] T. Guess, “Optimal sequence for CDMA with decision-feedback receivers,” IEEE Trans. Inf. Theory, vol. 49, pp. 886–900, Apr. 2003.

[15] H. Sampath, P. Stoica, and A. Paulraj, “Generalized linear precoder and decoder design for MIMO channels using the weighted MMSE criterion,” IEEE Trans. Commun., vol. 49, pp. 2198–2206, Dec. 2001.

[16] D. Palomar, M. Lagunas, and J. Cioffi, “Optimum linear joint transmit–receive processing for MIMO channels with QoS constraints,” IEEE Trans. Signal Process., vol. 52, no. 5, pp. 1179–1197, May 2004.

[17] A. Marshall and I. Olkin, Inequalities: Theory of Majorization. New York: Academic, 1979.

[18] G. J. Foschini, G. D. Golden, R. A. Valenzuela, and P. W. Wolniansky, “Simplified processing for high spectral efficiency wireless communication employing multiple-element arrays,” Wireless Personal Commun., vol. 6, pp. 311–335, Mar. 1999.

[19] B. Hassibi, “A fast square-root implementation for BLAST,” Proc. 34th Asilomar Conf. Signals, Systems, Computers, pp. 1255–1259, Nov. 2000.

[20] B. Hassibi, “An efficient square-root implementation for BLAST,” 2000 [Online]. Available: http://cm.bell-labs.com/who/hochwald/papers/squareroot/

[21] I. E. Telatar, “Capacity of multi-antenna Gaussian channels,” Eur. Trans. Telecommun., vol. 10, no. 6, pp. 585–595, 1999.

[22] P. Viswanath and D. Tse, “Sum capacity of the vector Gaussian broadcast channel and uplink–downlink duality,” IEEE Trans. Inf. Theory, vol. 49, pp. 1912–1921, Aug. 2003.

[23] R. A. Horn and C. R. Johnson, Topics in Matrix Analysis. Cambridge, U.K.: Cambridge Univ. Press, 1991.

Yi Jiang (S’02–M’06) received the B.Sc. degree in electrical engineering and information science from the University of Science and Technology of China (USTC), Hefei, China, in 2001, and the M.Sc. and Ph.D. degrees, both in electrical engineering, from the University of Florida, Gainesville, in 2003 and 2005, respectively.

In summer 2005, he was a Research Consultant with Information Science Technologies, Inc. (ISTI), Fort Collins, CO. He is now a Postdoctoral Researcher with the University of Colorado, Boulder. His research interests are in the areas of signal processing, wireless communications, and information theory.

William W. Hager received the B.S. degree in mathematics from Harvey Mudd College, Claremont, CA, in 1970 and the Master’s and Ph.D. degrees from the Massachusetts Institute of Technology, Cambridge, in 1971 and 1974, respectively. He was awarded a National Science Foundation Fellowship for graduate studies in mathematics.

He was an Assistant Professor at the University of South Florida, Tampa, from 1974 to 1976, an Assistant Professor at Carnegie Mellon University, Pittsburgh, PA, from 1976 to 1980, and an Associate Professor from 1980 to 1986 and a Professor from 1986 to 1988 at Pennsylvania State University, University Park. Since 1988, he has been a Professor at the University of Florida, Gainesville, and since 1992, he has been Co-Director of the Center for Applied Optimization at the University of Florida.

Dr. Hager was Co-Chair of the 2001 SIAM Conference on Control and Its Applications. A paper coauthored with his Ph.D. student H. Zhang received the 2006 SIAM Best Student Paper Prize. He was Program Director of the SIAM Activity Group on Control and System Theory from 1998 to 2001, and he has been Editor-in-Chief of Computational Optimization and Applications since 1992.

Jian Li (S’87–M’91–SM’97–F’05) received the M.Sc. and Ph.D. degrees in electrical engineering from The Ohio State University, Columbus, in 1987 and 1991, respectively.

From April 1991 to June 1991, she was an Adjunct Assistant Professor with the Department of Electrical Engineering, The Ohio State University, Columbus. From July 1991 to June 1993, she was an Assistant Professor with the Department of Electrical Engineering, University of Kentucky, Lexington. Since August 1993, she has been with the Department of Electrical and Computer Engineering, University of Florida, Gainesville, where she is currently a Professor. Her current research interests include spectral estimation, statistical and array signal processing, and their applications.

Dr. Li is a Fellow of the Institution for Electrical Engineers (IEE) and a member of Sigma Xi and Phi Kappa Phi. She received the 1994 National Science Foundation Young Investigator Award and the 1996 Office of Naval Research Young Investigator Award. She was an Executive Committee Member of the 2002 International Conference on Acoustics, Speech, and Signal Processing, Orlando, FL, May 2002. She was an Associate Editor of the IEEE TRANSACTIONS ON SIGNAL PROCESSING from 1999 to 2005 and an Associate Editor of the IEEE Signal Processing Magazine from 2003 to 2005. She has been a member of the Editorial Board of Signal Processing, a publication of the European Association for Signal Processing (EURASIP), since 2005. She is presently a member of two of the IEEE Signal Processing Society technical committees: the Signal Processing Theory and Methods (SPTM) Technical Committee and the Sensor Array and Multichannel (SAM) Technical Committee. She is a coauthor of a paper on multistatic adaptive microwave imaging for early breast cancer detection that has received the Best Student Paper Award at the 2005 Annual Asilomar Conference on Signals, Systems, and Computers in Pacific Grove, CA.
