

2640 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 10, NO. 8, AUGUST 2011

Pilot Contamination and Precoding in Multi-Cell TDD Systems

Jubin Jose, Student Member, IEEE, Alexei Ashikhmin, Senior Member, IEEE, Thomas L. Marzetta, Fellow, IEEE, and Sriram Vishwanath, Senior Member, IEEE

Abstract—This paper considers a multi-cell multiple antenna system with precoding used at the base stations for downlink transmission. Channel state information (CSI) is essential for precoding at the base stations. An effective technique for obtaining this CSI is time-division duplex (TDD) operation where uplink training in conjunction with reciprocity simultaneously provides the base stations with downlink as well as uplink channel estimates. This paper mathematically characterizes the impact that uplink training has on the performance of such multi-cell multiple antenna systems. When non-orthogonal training sequences are used for uplink training, the paper shows that the precoding matrix used by the base station in one cell becomes corrupted by the channel between that base station and the users in other cells in an undesirable manner. This paper analyzes this fundamental problem of pilot contamination in multi-cell systems. Furthermore, it develops a new multi-cell MMSE-based precoding method that mitigates this problem. In addition to being linear, this precoding method has a simple closed-form expression that results from an intuitive optimization. Numerical results show significant performance gains compared to certain popular single-cell precoding methods.

Index Terms—Time-division duplex systems, uplink training, pilot contamination, MMSE precoding.

I. INTRODUCTION

Multiple antennas, especially at the base station, have now become an accepted and a central feature of cellular networks. These networks have been studied extensively over the past one and a half decades (see [2] and references therein). It is now well understood that channel state information (CSI) at the base station is an essential component when trying to maximize network throughput. Systems with varying degrees of CSI have been studied in great detail in literature. The primary framework under which these have been studied is frequency division duplex (FDD) systems, where the CSI is typically obtained through (limited) feedback. There is a rich body of work in jointly designing this feedback mechanism with (pre)coding strategies to maximize throughput in MIMO downlink [3]–[8]. Time division duplex (TDD) systems, however, have a fundamentally different architecture from the ones studied in FDD systems [9], [10]. The goal of this paper is to develop a clear understanding of mechanisms for acquiring CSI and subsequently designing precoding strategies for multi-cell MIMO TDD systems.

Manuscript received June 29, 2010; revised January 27, 2011; accepted March 10, 2011. The associate editor coordinating the review of this paper and approving it for publication was A. Hjørungnes.

Results in this paper were presented in part at the IEEE International Symposium on Information Theory (ISIT) [1].

J. Jose and S. Vishwanath are with the Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX 78712 USA (e-mail: {jubin, sriram}@austin.utexas.edu). The authors were supported in part by the DoD and the ARO Young Investigator Program (YIP).

A. Ashikhmin and T. L. Marzetta are with Bell Laboratories, Alcatel-Lucent Inc., Murray Hill, NJ 07974 USA (e-mail: [email protected]; [email protected]).

Digital Object Identifier 10.1109/TWC.2011.060711.101155

Although most current cellular systems are frequency-division duplex (FDD), there is a compelling case to be made for time-division duplex (TDD). Arguably the central problem in advanced wireless systems is the acquisition of CSI in a timely manner. A major point of this paper is the desirability of having a large excess of base station antennas compared with terminals. For fast-changing channels, TDD offers the only way to acquire timely CSI, since the training burden for reverse-link pilots in a TDD system is independent of the number of base station antennas, while conversely the training burden for forward-link pilots in an FDD system is proportional to the number of antennas. Use of FDD would forever impose a severe limit on the number of base station antennas.

An important distinguishing feature of TDD systems is the notion of reciprocity, where the reverse channel is used as an estimate of the forward channel. While utilizing reciprocity, the differences in the transfer characteristics of the amplifiers and the filters in the two directions must be accounted for. This can be handled by measuring the different scalar gains in the two directions. Arguably, this reverse channel estimation is one of the best advantages of a TDD architecture, as it eliminates the need for feedback: uplink training together with the reciprocity of the wireless medium [11], [12] is sufficient to provide us with the desired CSI. In [13], channel reciprocity has been validated through experiments. However, as we see next, this channel estimate is not without issues that must be addressed before it proves useful.

In this paper, we consider uplink training and transmit precoding in a multi-cell scenario with $L$ cells, where each cell consists of a base station with $M$ antennas and $K$ users with a single antenna each. The impact of uplink training on the resulting channel estimate (and thus system performance) in the multi-cell scenario is significantly different from that in a single-cell scenario. In the multi-cell scenario, non-orthogonal training sequences (pilots) must be utilized, as orthogonal pilots would need to be at least $K \times L$ symbols long, which is infeasible for large $L$. In particular, short channel coherence times due to mobility do not allow for such long training sequences.

This non-orthogonal nature causes pilot contamination [1], [14], which is encountered only when analyzing a multi-cell MIMO system with training, and is lost when narrowing focus to a single-cell setting or to a multi-cell setting where channel information is assumed available at no cost. In this paper, we perform a more detailed study of this problem and consider precoding in its presence. First, we study the impact of pilot contamination (and thus achievable rates), and then we develop methods that mitigate this contamination. In older generation cellular systems, multiple factors, including large reuse distance of any training signal and randomization in the selection of pilots, would have helped in keeping the impact of pilot contamination reasonable. However, with newer generation systems designed for more aggressive reuse of spectrum, this impact is crucial to understand and mitigate. We note that pilot contamination must also figure in Cooperative MIMO (also called Network MIMO [15], [16]), where clusters of base stations are wired together to create distributed arrays, and where pilots must be re-used over multiple clusters.

1536-1276/11$25.00 © 2011 IEEE

[Figure 1: diagram of two cells, Cell-1 (BS-1, User-1) and Cell-2 (BS-2, User-2), with the channels h12 and h22 labeled.]

Fig. 1. A two-cell example with one user in each cell. Both users transmit non-orthogonal pilots during uplink training, which leads to pilot contamination at both the base stations.

The fundamental problem associated with pilot contamination is evident even in the simple multi-cell scenario shown in Figure 1. Consider two cells $i \in \{1, 2\}$, each consisting of one base station and one user. Let $\mathbf{h}_{ij}$ denote the channel between the user in the $i$-th cell and the base station in the $j$-th cell. Let the training sequences used by both users be the same. In this case, the MMSE channel estimate of $\mathbf{h}_{22}$ at the base station in the 2nd cell is $\hat{\mathbf{h}}_{22} = c_1 \mathbf{h}_{12} + c_2 \mathbf{h}_{22} + c\,\mathbf{w}$. Here $c_1$, $c_2$ and $c$ are constants that depend on the propagation factors and the transmit powers of the mobiles, and $\mathbf{w}$ is $\mathcal{CN}(\mathbf{0}, \mathbf{I})$ additive noise. The base station in the 2nd cell uses this channel estimate to form a precoding vector $\mathbf{a}_2 = f(\hat{\mathbf{h}}_{22})$, which is usually aligned with the channel estimate, that is, $\mathbf{a}_2 = \mathrm{const} \cdot \hat{\mathbf{h}}_{22}$. However, by doing this, the base station (partially) aligns the transmitted signal with both $\mathbf{h}_{22}$ (which is desirable) and $\mathbf{h}_{12}$ (which is undesirable). Both signal ($\mathbf{h}_{22}\mathbf{a}_2^\dagger$) and interference ($\mathbf{h}_{12}\mathbf{a}_2^\dagger$) statistically behave similarly. Therefore, the general assumption that the precoding vector used by a base station in one cell is uncorrelated with the channel to users in other cells is not valid with uplink training using non-orthogonal training sequences. This fundamental problem is studied in further detail in this paper.
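The two-cell argument above can be checked numerically. The following is a minimal sketch, not the authors' simulation: the function names, the values of `M`, `beta12`, `beta22` and `pr_tau`, and the choice of the conjugate of the pilot observation as the precoder direction are all illustrative assumptions. It compares how strongly the precoder of BS-2 aligns with $\mathbf{h}_{22}$, with $\mathbf{h}_{12}$, and with an unrelated channel.

```python
import math
import random

def cn_vector(rng, n):
    """Draw n i.i.d. CN(0, 1) samples."""
    s = math.sqrt(0.5)
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]

def alignment(h, a):
    """Return |h a| / ||h|| for a row channel h and a unit-norm precoder a."""
    inner = sum(x * y for x, y in zip(h, a))
    return abs(inner) / math.sqrt(sum(abs(x) ** 2 for x in h))

def two_cell_alignment(M=64, beta12=0.5, beta22=1.0, pr_tau=10.0,
                       trials=200, seed=0):
    """Average alignment of BS-2's precoder with h22, h12 and an unrelated channel.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(trials):
        h12, h22, h_other, w = (cn_vector(rng, M) for _ in range(4))
        # Pilot observation at BS-2: both users sent the same pilot, so the
        # two channels arrive superimposed, plus CN(0, 1) noise.
        r = [math.sqrt(pr_tau * beta12) * x + math.sqrt(pr_tau * beta22) * y + n
             for x, y, n in zip(h12, h22, w)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in r))
        # The channel estimate is a scaled copy of r; the precoder is taken
        # as its normalized conjugate (an assumed matched-filter choice).
        a2 = [x.conjugate() / norm for x in r]
        for idx, h in enumerate((h22, h12, h_other)):
            totals[idx] += alignment(h, a2)
    return tuple(t / trials for t in totals)
```

Under these assumptions, the alignment with $\mathbf{h}_{12}$ should be of the same order as the alignment with $\mathbf{h}_{22}$, and both should be well above the alignment with an independent channel.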

To perform pilot contamination analysis, we first develop analytical expressions using techniques similar to those used in [9], [10]. For the setting with one user in every cell, we derive closed-form expressions for achievable rates. These closed-form expressions allow us to determine the extent to which pilot contamination impacts system performance. In particular, we show that the achievable rates can saturate with the number of antennas at the base station $M$. This analysis will allow system designers to determine the appropriate frequency/time/pilot reuse factor to maximize system throughput in the presence of pilot contamination.

In the multi-cell scenario, there has been significant work on utilizing coordination among base stations [15]–[18] when CSI is available. This existing body of work focuses on the gain that can be obtained through coordination of the base stations. Dirty paper coding based approaches and joint beamforming/precoding approaches are considered in [18]. Linear precoding methods for clustered networks with full intra-cluster coordination and limited inter-cluster coordination are proposed in [19]. These approaches generally require "good" channel estimates at the base stations. Due to non-orthogonal training sequences, the resulting channel estimate (of the channel between a base station and all users) can be shown to be rank deficient. We develop a multi-cell MMSE-based precoding method that depends on the set of training sequences assigned to the users. Note that this MMSE-based precoding is for the general setting with multiple users in every cell. Our approach does not need coordination between base stations required by the joint precoding techniques¹. When coordination is present, this approach can be applied at the inter-cluster level. The MMSE-based precoding derived in this paper has several advantages. In addition to being a linear precoding method, it has a simple closed-form expression that results from an intuitive optimization problem formulation. For many training sequence allocations, numerical results show that our approach gives significant gains over certain popular single-cell precoding methods including zero-forcing.

A. Related Work

Over the past decade, a variety of aspects of downlink and uplink transmission problems in a single-cell setting have been studied. In information theoretic literature, these problems are studied as the broadcast channel (BC) and the multiple access channel (MAC), respectively. For Gaussian BC and general MAC, the problems have been studied for both single and multiple antenna cases. The sum capacity of the multi-antenna Gaussian BC has been shown to be achieved by dirty paper coding (DPC) in [20]–[23]. It was shown in [24] that DPC characterizes the full capacity region of the multi-antenna Gaussian BC. These results assume perfect CSI at the base station and the users. In addition, the DPC technique is computationally challenging to implement in practice. There has been significant research focus on reducing the computational complexity at the base station and the users. In this regard, different precoding schemes with low complexity have been proposed. This body of work [25]–[29] demonstrates that sum rates close to sum capacity can be achieved with much lower computational complexity. However, these results assume perfect CSI at the base station and the users.

¹ There is no exchange of channel state information among base stations.


The problem of lack of CSI is usually studied by considering one of the following two settings. As discussed before, in the first setting, CSI at the users is assumed to be available and a limited feedback link is assumed to exist from the users to the base station. Such a setting is considered in [3], [5]–[8], [30]. In [5], the authors show that at high signal-to-noise ratios (SNRs), the feedback rate required per user must grow linearly with the SNR (in dB) in order to obtain the full MIMO BC multiplexing gain. The main result in [6] is that the extent of CSI feedback can be reduced by exploiting multi-user diversity. In [7], it is shown that nonrandom vector quantizers can significantly increase the MIMO downlink throughput. In [8], the authors design a joint CSI quantization, beamforming and scheduling algorithm to attain optimal throughput scaling. In the second setting, time-division duplex systems are considered and channel training and estimation error are accounted for in the net achievable rate. This approach is used in [9], [10], [31], [32]. In [9], the authors give a lower bound on sum capacity and demonstrate that it is always beneficial to increase the number of antennas at the base station. In [10], the authors study a heterogeneous user setting and present scheduling and precoding methods for this setting. In [31], the authors consider two-way training and propose two variants of linear MMSE precoders as alternatives to the linear zero-forcing precoder used in [9]. A single-cell analysis of TDD systems is also provided in [32].

Given this extensive body of literature on single-cell systems, the main contribution of this paper is in understanding multi-cell systems with channel training. Its emphasis is on TDD systems, which are arguably poorly studied compared to FDD systems. Specifically, the main contributions are to demonstrate the pilot contamination problem associated with uplink training, understand its impact on the operation of multi-cell MIMO TDD cellular systems, and develop a new precoding method to mitigate this problem.

B. Notation

We use bold font variables to denote matrices and vectors. $(\cdot)^T$ denotes the transpose and $(\cdot)^\dagger$ denotes the Hermitian transpose, $\mathrm{tr}\{\cdot\}$ denotes the trace operation, $(\cdot)^{-1}$ denotes the inverse operation, and $\|\cdot\|$ denotes the two-norm. $\mathrm{diag}\{\mathbf{a}\}$ denotes a diagonal matrix with diagonal entries equal to the components of $\mathbf{a}$. $\mathbb{E}[\cdot]$ and $\mathrm{var}\{\cdot\}$ stand for expectation and variance operations, respectively.

C. Organization

The rest of this paper is organized as follows. In Section II, we describe the multi-cell system model. In Section III, we explain the communication scheme and the technique to obtain achievable rates. We analyze the effect of pilot contamination in Section IV, and give the details of the new precoding method in Section V. We present numerical results in Section VI. Finally, we provide our concluding remarks in Section VII.

II. MULTI-CELL TDD SYSTEM MODEL

We consider a cellular system with $L$ cells numbered $1, 2, \ldots, L$. Each cell consists of one base station with $M$ antennas and $K (\leq M)$ single-antenna users². Let the average power (during transmission) at the base station be $p_f$ and the average power (during transmission) at each user be $p_r$. The propagation factor between the $m$-th base station antenna of the $l$-th cell and the $k$-th user of the $j$-th cell is

$$\sqrt{\beta_{jlk}}\, h_{jlkm},$$

where $\{\beta_{jlk}\}$ are non-negative constants and assumed to be known to everybody, and $\{h_{jlkm}\}$ are independent and identically distributed (i.i.d.) zero-mean, circularly-symmetric complex Gaussian $\mathcal{CN}(0, 1)$ random variables and known to nobody³. This system model is shown in Figure 2. The above assumptions are fairly accurate and justified for the following reason. The $\{\beta_{jlk}\}$ values model path loss and shadowing, which change slowly and can be learned over a long period of time, while the $\{h_{jlkm}\}$ values model fading, which changes relatively fast and must be learned and used very quickly. Since the cell layout and shadowing are captured using the constant $\{\beta_{jlk}\}$ values, for the purpose of this paper, the specific details of the cell layout and shadowing model are irrelevant. In other words, any cell layout and any shadowing model can be incorporated with the above abstraction.

[Figure 2: diagram of the $m$-th base station antenna in Cell-$l$ and User-$k$ in Cell-$j$, linked by the propagation factor $\sqrt{\beta_{jlk}}\, h_{jlkm}$; labels "Power $\rho_f$" and "Power $\rho_r$".]

Fig. 2. System model showing the base station in the $l$-th cell and the $k$-th user in the $j$-th cell.

We assume channel reciprocity for the forward and reverse links, i.e., the propagation factor $\sqrt{\beta_{jlk}}\, h_{jlkm}$ is the same for both forward and reverse links, and block fading, i.e., $\{h_{jlkm}\}$ remains constant for a duration of $T$ symbols. Note that we allow for a constant factor variation in forward and reverse propagation factors through the different average power constraints at the base stations and the users. The additive noises at all terminals are i.i.d. $\mathcal{CN}(0, 1)$ random variables. The system equations describing the signals received at the base station and the users are given in the next section.

III. COMMUNICATION SCHEME

The communication scheme consists of two phases: uplink training and data transmission. The uplink training phase consists of users transmitting training pilots, and base stations obtaining channel estimates. The data transmission phase consists of base stations transmitting data to the users through transmit precoding. Next, we describe these phases briefly and provide a set of achievable data rates using a given precoding method.

² These are the users of interest for applying the precoding framework.

³ For compact notation, we do not separate the subscript or superscript indices using commas throughout the paper.


A. Uplink Training

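At the beginning of every coherence interval, all users (in all cells) transmit training sequences, which are $\tau$-length column vectors⁴. Let $\sqrt{\tau}\,\boldsymbol{\psi}_{jk}$ (normalized such that $\boldsymbol{\psi}_{jk}^\dagger \boldsymbol{\psi}_{jk} = 1$) be the training vector transmitted by the $k$-th user in the $j$-th cell. Consider the base station of the $l$-th cell. The $\tau$-length column vector received at the $m$-th antenna of this base station is

$$\mathbf{y}_{lm} = \sum_{j=1}^{L} \sum_{k=1}^{K} \sqrt{p_r \tau \beta_{jlk}}\, h_{jlkm}\, \boldsymbol{\psi}_{jk} + \mathbf{w}_{lm}, \tag{1}$$

where $\mathbf{w}_{lm}$ is the additive noise. Let $\mathbf{Y}_l = [\mathbf{y}_{l1}\ \mathbf{y}_{l2}\ \ldots\ \mathbf{y}_{lM}]$ ($\tau \times M$ matrix), $\mathbf{W}_l = [\mathbf{w}_{l1}\ \mathbf{w}_{l2}\ \ldots\ \mathbf{w}_{lM}]$ ($\tau \times M$ matrix), $\boldsymbol{\Psi}_j = [\boldsymbol{\psi}_{j1}\ \boldsymbol{\psi}_{j2}\ \ldots\ \boldsymbol{\psi}_{jK}]$ ($\tau \times K$ matrix), $\mathbf{D}_{jl} = \mathrm{diag}\{[\beta_{jl1}\ \beta_{jl2}\ \ldots\ \beta_{jlK}]\}$, and

$$\mathbf{H}_{jl} = \begin{bmatrix} h_{jl11} & \cdots & h_{jl1M} \\ \vdots & \ddots & \vdots \\ h_{jlK1} & \cdots & h_{jlKM} \end{bmatrix}.$$

From (1), the signal received at this base station can be expressed as

$$\mathbf{Y}_l = \sqrt{p_r \tau} \sum_{j=1}^{L} \boldsymbol{\Psi}_j \mathbf{D}_{jl}^{1/2} \mathbf{H}_{jl} + \mathbf{W}_l. \tag{2}$$

The MMSE estimate of the channel $\mathbf{H}_{jl}$ given $\mathbf{Y}_l$ in (2) is

$$\hat{\mathbf{H}}_{jl} = \sqrt{p_r \tau}\, \mathbf{D}_{jl}^{1/2} \boldsymbol{\Psi}_j^\dagger \left( \mathbf{I} + p_r \tau \sum_{i=1}^{L} \boldsymbol{\Psi}_i \mathbf{D}_{il} \boldsymbol{\Psi}_i^\dagger \right)^{-1} \mathbf{Y}_l. \tag{3}$$

This MMSE estimate in (3) follows from standard results in estimation theory (for example, see [33]). We denote the MMSE estimate of the channel between this base station and all users by $\hat{\mathbf{H}}_l = [\hat{\mathbf{H}}_{1l}\ \hat{\mathbf{H}}_{2l}\ \ldots\ \hat{\mathbf{H}}_{Ll}]$. This notation is used later in Section IV.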
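As a rough illustration of (1) and (3), the sketch below simulates uplink training at a single base station under one simplifying assumption that is not made in the paper: every pilot $\boldsymbol{\psi}_{jk}$ is a standard basis vector $\mathbf{e}_n$ of length $\tau$. In that special case the matrix inverted in (3) is diagonal, so no linear algebra library is needed. The function name `mmse_estimates` and the dictionary-based interface are hypothetical.

```python
import math
import random

def cn(rng):
    """Draw one CN(0, 1) sample."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def mmse_estimates(pilot_of, beta, pr, tau, M, rng):
    """Simulate (1) at one base station l and apply (3) for basis-vector pilots.

    pilot_of[(j, k)] -> pilot index n in range(tau), meaning psi_jk = e_n (assumed)
    beta[(j, k)]     -> beta_jlk toward the observing base station l
    Returns (h, h_hat): true and estimated fading rows, keyed by user (j, k).
    """
    users = list(pilot_of)
    h = {u: [cn(rng) for _ in range(M)] for u in users}
    # Y_l starts as the tau x M noise matrix W_l.
    Y = [[cn(rng) for _ in range(M)] for _ in range(tau)]
    for u in users:
        amp = math.sqrt(pr * tau * beta[u])
        row = Y[pilot_of[u]]  # psi = e_n only excites row n of Y_l
        for m in range(M):
            row[m] += amp * h[u][m]
    # I + pr*tau*sum_i Psi_i D_il Psi_i^dagger is diagonal for basis-vector pilots.
    load = [1.0] * tau
    for u in users:
        load[pilot_of[u]] += pr * tau * beta[u]
    h_hat = {}
    for u in users:
        n = pilot_of[u]
        scale = math.sqrt(pr * tau * beta[u]) / load[n]
        h_hat[u] = [scale * y for y in Y[n]]
    return h, h_hat
```

For example, with `pilot_of = {(1, 1): 0, (2, 1): 0, (1, 2): 1}`, the two users that share pilot 0 receive estimates that are scaled copies of the same row of $\mathbf{Y}_l$, while the user with a private pilot gets an estimate close to its true channel when `pr` is large.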

B. Downlink Transmission

Consider the base station of the $l$-th cell. Let the information symbols to be transmitted to users in the $l$-th cell be $\mathbf{q}_l = [q_{l1}\ q_{l2}\ \ldots\ q_{lK}]^T$ and the $M \times K$ linear precoding matrix be $\mathbf{A}_l = f(\hat{\mathbf{H}}_l)$. The function $f(\cdot)$ corresponds to the specific (linear) precoding method performed at the base station. The signal vector transmitted by this base station is $\mathbf{A}_l \mathbf{q}_l$. We consider transmission symbols and precoding methods such that $\mathbb{E}[\mathbf{q}_l] = \mathbf{0}$, $\mathbb{E}[\mathbf{q}_l \mathbf{q}_l^\dagger] = \mathbf{I}$, and $\mathrm{tr}\{\mathbf{A}_l^\dagger \mathbf{A}_l\} = 1$. These (sufficient) conditions imply that the average power constraint at the base station is satisfied.

⁴ We assume that there is time synchronization present in the system for coherent uplink transmission.

Now, consider the users in the $j$-th cell. The noisy signal vector received by these users is

$$\mathbf{x}_j = \sum_{l=1}^{L} \sqrt{p_f}\, \mathbf{D}_{jl}^{1/2} \mathbf{H}_{jl} \mathbf{A}_l \mathbf{q}_l + \mathbf{z}_j, \quad (K \times 1 \text{ vector}) \tag{4}$$

where $\mathbf{z}_j$ is the additive noise. From (4), the signal received by the $k$-th user can be expressed as

$$x_{jk} = \sum_{l=1}^{L} \sum_{i=1}^{K} \sqrt{p_f \beta_{jlk}}\, [h_{jlk1}\ \ldots\ h_{jlkM}]\, \mathbf{a}_{li}\, q_{li} + z_{jk}, \tag{5}$$

where $\mathbf{a}_{li}$ is the $i$-th column of the precoding matrix $\mathbf{A}_l$ and $z_{jk}$ is the $k$-th element of $\mathbf{z}_j$.
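The downlink relations above can be sketched in a few lines. The code below is only an illustration under assumed data structures (nested lists for matrices, dictionaries keyed by cell and user indices, users indexed from 0); the helper names `normalize_precoder` and `received_symbol` are hypothetical. It scales a precoder so that $\mathrm{tr}\{\mathbf{A}_l^\dagger \mathbf{A}_l\} = 1$ and evaluates (5) for one user.

```python
import math

def normalize_precoder(A):
    """Scale an M x K precoder (list of rows) so that tr{A^dagger A} = 1."""
    power = sum(abs(a) ** 2 for row in A for a in row)
    scale = 1.0 / math.sqrt(power)
    return [[scale * a for a in row] for row in A]

def received_symbol(j, k, h, beta, A, q, pf, z=0j):
    """Evaluate (5): the symbol seen by user k of cell j.

    h[(j, l, k)]    -> length-M fading row [h_jlk1 ... h_jlkM]
    beta[(j, l, k)] -> beta_jlk
    A[l]            -> normalized M x K precoder of base station l
    q[l]            -> length-K symbol vector of base station l
    z               -> additive noise sample z_jk
    """
    x = z
    for l, A_l in A.items():
        gain = math.sqrt(pf * beta[(j, l, k)])
        row = h[(j, l, k)]
        for i, q_li in enumerate(q[l]):
            # [h_jlk1 ... h_jlkM] a_li, with a_li the i-th column of A_l
            inner = sum(row[m] * A_l[m][i] for m in range(len(row)))
            x += gain * inner * q_li
    return x
```

With a single cell, `A = {1: normalize_precoder([[3], [4j]])}`, `h = {(1, 1, 0): [1, -1j]}`, `beta = {(1, 1, 0): 4.0}` and `q = {1: [1]}`, the noiseless received symbol is $2 \times 1.4 = 2.8$.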

C. Achievable Rates

Next, we provide a set of achievable rates using the method suggested in [9]. With the above communication scheme, the base stations have channel estimates while the users do not have any channel estimate. Therefore, the achievable rates we derive have a different structure compared to typical rate expressions. In particular, the effective noise term has channel variations around the mean in addition to typical terms.

Let $g_{li}^{jk} = \sqrt{p_f \beta_{jlk}}\, [h_{jlk1}\ \ldots\ h_{jlkM}]\, \mathbf{a}_{li}$. Now, (5) can be written in the form

$$\begin{aligned} x_{jk} &= \sum_{l=1}^{L} \sum_{i=1}^{K} g_{li}^{jk} q_{li} + z_{jk} \\ &= \mathbb{E}\left[g_{jk}^{jk}\right] q_{jk} + \left(g_{jk}^{jk} - \mathbb{E}\left[g_{jk}^{jk}\right]\right) q_{jk} + \sum_{(l,i) \neq (j,k)} g_{li}^{jk} q_{li} + z_{jk}. \end{aligned} \tag{6}$$

In (6), the effective noise is defined as

$$z'_{jk} = \left(g_{jk}^{jk} - \mathbb{E}\left[g_{jk}^{jk}\right]\right) q_{jk} + \sum_{(l,i) \neq (j,k)} g_{li}^{jk} q_{li} + z_{jk}. \tag{7}$$

Now, the effective point-to-point channel described by (6) can be written in the familiar form

$$x_{jk} = \mathbb{E}\left[g_{jk}^{jk}\right] q_{jk} + z'_{jk}, \tag{8}$$

where $q_{jk}$ is the input, $x_{jk}$ is the output, $\mathbb{E}[g_{jk}^{jk}]$ is the known channel and $z'_{jk}$ is the additive noise. This expectation is known as it only depends on the channel distribution and not the instantaneous channel. However, the additive noise is neither independent nor Gaussian. We use the result in [34], which shows that the worst-case uncorrelated additive noise is independent Gaussian noise of the same variance, to derive the following achievable rates.

Theorem 1: Consider the point-to-point communication channels given by (8). Then, the following set of rates is achievable:

$$R_{jk} = C\left( \frac{\left|\mathbb{E}\left[g_{jk}^{jk}\right]\right|^2}{1 + \mathrm{var}\left\{g_{jk}^{jk}\right\} + \sum_{(l,i) \neq (j,k)} \mathbb{E}\left[\left|g_{li}^{jk}\right|^2\right]} \right), \tag{9}$$

where $C(\theta) = \log_2(1 + \theta)$.

Proof: Consider complex Gaussian $\mathcal{CN}(0, 1)$ distributions for all inputs $Q_{jk}$⁵. Then, the resulting mutual information with this (not necessarily maximizing) input distribution, $I(Q_{jk}; X_{jk})$, is an achievable rate. However, this does not result in a computable expression. To obtain a computable lower bound to this rate, we observe that the input random variable $Q_{jk}$ and the effective noise $Z'_{jk}$ given by (7) are uncorrelated based on the following: The input distributions are such that $Q_{jk}$ is clearly independent of $Q_{li}$ for all $(l, i) \neq (j, k)$ and of $Z_{jk}$. Furthermore, $Q_{jk}$ is independent of $G_{jk}^{jk}$. Therefore,

$$\mathbb{E}\left[\left(G_{jk}^{jk} - \mathbb{E}\left[G_{jk}^{jk}\right]\right) |Q_{jk}|^2\right] = \mathbb{E}\left[\left(G_{jk}^{jk} - \mathbb{E}\left[G_{jk}^{jk}\right]\right)\right] \mathbb{E}\left[|Q_{jk}|^2\right] = 0.$$

The variance of the effective noise is

$$\mathbb{E}\left[|Z'_{jk}|^2\right] = \mathrm{var}\left\{G_{jk}^{jk}\right\} + \sum_{(l,i) \neq (j,k)} \mathbb{E}\left[\left|G_{li}^{jk}\right|^2\right] + 1.$$

Now, from [34] (Theorem 1), we know that the channel with independent Gaussian noise $\hat{Z}_{jk}$ of the same variance, given by

$$\hat{X}_{jk} = \mathbb{E}\left[G_{jk}^{jk}\right] Q_{jk} + \hat{Z}_{jk},$$

is worse, i.e.,

$$\begin{aligned} I(Q_{jk}; X_{jk}) &\geq I(Q_{jk}; \hat{X}_{jk}) \\ &= h(\hat{X}_{jk}) - h(\hat{X}_{jk} \mid Q_{jk}) \\ &= h(\hat{X}_{jk}) - h(\hat{Z}_{jk}). \end{aligned}$$

This completes the proof.

⁵ Upper case symbols are used in this proof to emphasize that these are random variables.

The set of achievable rates given by (9) is valid for any linear precoding method, and depends on the precoding method through the expectation and variance terms appearing in (9). Similar achievable rates are used in the single-cell setting as well to study and/or compare precoding methods. Next, we perform pilot contamination analysis using these achievable rates.
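When closed-form moments are unavailable, the rate in (9) can be estimated from samples of the effective gains. The sketch below assumes such samples have already been generated for one user $(j, k)$ and stored in a dictionary keyed by $(l, i)$; the function name `achievable_rate` and this interface are illustrative choices, and sample averages stand in for the expectations in (9).

```python
import math

def achievable_rate(g, own):
    """Estimate R_jk in (9) from samples of the effective gains.

    g[(l, i)] -> list of complex samples of g^{jk}_{li} for a fixed user (j, k)
    own       -> the key (j, k) of the desired stream
    """
    samples = g[own]
    n = len(samples)
    mean = sum(samples) / n
    # var{g^{jk}_{jk}}: channel variation around the known mean gain
    var = sum(abs(x - mean) ** 2 for x in samples) / n
    # Sum over (l, i) != (j, k) of E[|g^{jk}_{li}|^2]
    interference = sum(
        sum(abs(x) ** 2 for x in s) / len(s)
        for key, s in g.items() if key != own
    )
    sinr = abs(mean) ** 2 / (1.0 + var + interference)
    return math.log2(1.0 + sinr)
```

For a deterministic gain of 2 and no interference, the estimate is $\log_2(1 + 4)$; adding one interfering stream with unit average power lowers it to $\log_2(1 + 4/2)$.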

IV. PILOT CONTAMINATION ANALYSIS

We analyze the pilot contamination problem in the following setting: one user per cell ($K = 1$), same training sequence used by all users ($\boldsymbol{\psi}_{j1} = \boldsymbol{\psi}$, $\forall j$) and matched-filter (MF) precoding. We consider this setting as it captures the primary effect of pilot contamination, which is the correlation between the precoding matrix (vector in this setting) used by the base station in a cell and the channel to users in other cells. We provide simple and insightful analytical results in this setting. As mentioned earlier, we emphasize that the pilot contamination problem results from uplink training with non-orthogonal training sequences, and hence, it is not specific to the setting considered here. However, the level of its impact on the achievable rates would vary depending on the system settings.

In order to simplify notation, we drop the subscripts asso-ciated with the users in every cell. In this section, H𝑗𝑙, H𝑗𝑙

and A𝑙 are vectors and we denote these using h𝑗𝑙, h𝑗𝑙 anda𝑙, respectively. The matched-filter used at the base station inthe 𝑙-th cell is given by a𝑙 = h†

𝑙𝑙/∥h𝑙𝑙∥. The user in the 𝑗-thcell receives signal from its base station and from other basestations. From (4), this received signal is

𝑥𝑗𝑘 =√𝑝𝑓𝛽𝑗𝑗h𝑗𝑗a𝑗𝑞𝑗 +

∑𝑙 ∕=𝑗

√𝑝𝑓𝛽𝑗𝑙h𝑗𝑙a𝑙𝑞𝑙 + 𝑧𝑗. (10)

We compute first and second order moments of the effectivechannel gain and the inter-cell interference and use these to

obtain a simple expression for the achievable rate given by(9).

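In the setting considered here, the MMSE estimate of $\mathbf{h}_{jl}$ based on $\mathbf{Y}_l$ given by (3) can be simplified using the matrix inversion lemma and the fact that $\boldsymbol{\psi}^\dagger\boldsymbol{\psi} = 1$ as follows:

$$\begin{aligned}
\hat{\mathbf{h}}_{jl} &= \sqrt{p_r\tau\beta_{jl}}\,\boldsymbol{\psi}^\dagger\left(\mathbf{I} + \boldsymbol{\psi}\left(p_r\tau\sum_{i=1}^{L}\beta_{il}\right)\boldsymbol{\psi}^\dagger\right)^{-1}\mathbf{Y}_l,\\
&= \sqrt{p_r\tau\beta_{jl}}\,\boldsymbol{\psi}^\dagger\left(\mathbf{I} - \frac{\boldsymbol{\psi}\left(p_r\tau\sum_{i=1}^{L}\beta_{il}\right)\boldsymbol{\psi}^\dagger}{1 + p_r\tau\sum_{i=1}^{L}\beta_{il}}\right)\mathbf{Y}_l,\\
&= \sqrt{p_r\tau\beta_{jl}}\left(\boldsymbol{\psi}^\dagger - \frac{p_r\tau\sum_{i=1}^{L}\beta_{il}}{1 + p_r\tau\sum_{i=1}^{L}\beta_{il}}\,\boldsymbol{\psi}^\dagger\right)\mathbf{Y}_l,\\
&= \frac{\sqrt{p_r\tau\beta_{jl}}}{\kappa_l}\,\boldsymbol{\psi}^\dagger\mathbf{Y}_l,
\end{aligned}$$

where $\kappa_l = 1 + p_r\tau\sum_{i=1}^{L}\beta_{il}$.

Remark 1: The above channel estimates clearly suggest the graveness of the pilot contamination impairment. For a given base station, its estimate of every channel is simply a scaled version of the same vector $\boldsymbol{\psi}^\dagger\mathbf{Y}_l$. Thus, it cannot distinguish between the channel to its user and other users, which makes pilot contamination a fundamental problem in multi-cell systems.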
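The observation in Remark 1 can be illustrated numerically. The following is a minimal sketch, not the authors' simulation code: it assumes the training model $\mathbf{Y}_l = \sqrt{p_r\tau}\sum_i \sqrt{\beta_{il}}\,\boldsymbol{\psi}\mathbf{h}_{il} + \mathbf{Z}_l$ with unit-variance noise, and the helper names (`cn`, `shared_pilot_estimates`, `unit`) and parameter values are hypothetical. It forms the estimates of the own-cell and other-cell channels at one base station and checks that they point in the same direction.

```python
import math
import random


def cn(rng):
    """Draw one CN(0, 1) sample (unit-variance circular complex Gaussian)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def shared_pilot_estimates(beta_l, p_r, tau, M, rng):
    """Return the MMSE estimates h_hat_{jl}, one per cell j, at base station l.

    beta_l[i] plays the role of beta_{il}; every user sends the same
    unit-norm pilot psi (an assumption matching the setting of this section).
    """
    L = len(beta_l)
    psi = [1.0 / math.sqrt(tau)] * tau  # real pilot with psi^H psi = 1
    h = [[cn(rng) for _ in range(M)] for _ in range(L)]  # true channels h_{il}
    # Assumed training model: Y_l = sqrt(p_r tau) sum_i sqrt(beta_il) psi h_il + Z_l
    Y = [
        [
            math.sqrt(p_r * tau)
            * sum(math.sqrt(beta_l[i]) * psi[t] * h[i][m] for i in range(L))
            + cn(rng)
            for m in range(M)
        ]
        for t in range(tau)
    ]
    # psi^H Y_l: the single 1 x M vector from which every estimate is built
    proj = [sum(psi[t] * Y[t][m] for t in range(tau)) for m in range(M)]
    kappa = 1.0 + p_r * tau * sum(beta_l)
    return [[math.sqrt(p_r * tau * b) / kappa * v for v in proj] for b in beta_l]


def unit(v):
    """Normalize a complex vector to unit Euclidean norm."""
    n = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / n for x in v]


rng = random.Random(0)
est = shared_pilot_estimates(beta_l=[1.0, 0.8], p_r=10.0, tau=4, M=8, rng=rng)
own, other = unit(est[0]), unit(est[1])
max_gap = max(abs(a - b) for a, b in zip(own, other))
print(max_gap)  # expected to be at floating-point round-off level
```

Because the two normalized estimates coincide, a matched filter built from the own-cell estimate is equally matched to the estimate of the other cell's user, which is the effect analyzed in the rest of this section.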

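Since $\boldsymbol{\psi}^\dagger\mathbf{Y}_l$ is proportional to the MMSE estimate of $\mathbf{h}_{jl}$ for any $j$, we have

$$\frac{\hat{\mathbf{h}}_{jl}^\dagger}{\|\hat{\mathbf{h}}_{jl}\|} = \frac{\mathbf{Y}_l^\dagger\boldsymbol{\psi}}{\|\boldsymbol{\psi}^\dagger\mathbf{Y}_l\|}, \quad \forall j. \tag{11}$$

Using (11), we obtain

$$\begin{aligned}
\mathbf{h}_{jl}\mathbf{a}_l &= \mathbf{h}_{jl}\frac{\hat{\mathbf{h}}_{jl}^\dagger}{\|\hat{\mathbf{h}}_{jl}^\dagger\|},\\
&= \|\hat{\mathbf{h}}_{jl}^\dagger\| + \tilde{\mathbf{h}}_{jl}\frac{\hat{\mathbf{h}}_{jl}^\dagger}{\|\hat{\mathbf{h}}_{jl}^\dagger\|},
\end{aligned} \tag{12}$$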

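where $\tilde{\mathbf{h}}_{jl} = \mathbf{h}_{jl} - \hat{\mathbf{h}}_{jl}$. From the properties of MMSE estimation, we know that $\hat{\mathbf{h}}_{jl}$ is independent of $\tilde{\mathbf{h}}_{jl}$, $\hat{\mathbf{h}}_{jl}$ is $\mathcal{CN}\left(\mathbf{0}, \frac{p_r\tau\beta_{jl}}{\kappa_l}\mathbf{I}\right)$, and $\tilde{\mathbf{h}}_{jl}$ is $\mathcal{CN}\left(\mathbf{0}, \frac{1 + p_r\tau\sum_{i\neq j}\beta_{il}}{\kappa_l}\mathbf{I}\right)$. These results are used next.

From (12), we get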

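$$\mathbb{E}\left[\mathbf{h}_{jl}\mathbf{a}_l\right] = \mathbb{E}\left[\|\hat{\mathbf{h}}_{jl}^\dagger\|\right] = \sqrt{\frac{p_r\tau\beta_{jl}}{\kappa_l}}\,\mathbb{E}[\theta], \tag{13}$$

where $\theta = \sqrt{\sum_{m=1}^{M}|u_m|^2}$ and $\{u_m\}$ is i.i.d. $\mathcal{CN}(0, 1)$. From (12), we also have

$$\begin{aligned}
\mathbb{E}\left[\|\mathbf{h}_{jl}\mathbf{a}_l\|^2\right] &= \mathbb{E}\left[\|\hat{\mathbf{h}}_{jl}^\dagger\|^2\right] + \mathbb{E}\left[\frac{\hat{\mathbf{h}}_{jl}}{\|\hat{\mathbf{h}}_{jl}^\dagger\|}\,\tilde{\mathbf{h}}_{jl}^\dagger\tilde{\mathbf{h}}_{jl}\,\frac{\hat{\mathbf{h}}_{jl}^\dagger}{\|\hat{\mathbf{h}}_{jl}^\dagger\|}\right],\\
&= \frac{p_r\tau\beta_{jl}}{\kappa_l}\,\mathbb{E}[\theta^2] + \frac{1 + p_r\tau\sum_{i\neq j}\beta_{il}}{\kappa_l}.
\end{aligned} \tag{14}$$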


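Next, we state two lemmas required to obtain a closed-form expression for the achievable rate.

Lemma 2: The effective channel gain in (10) has expectation

$$\mathbb{E}\left[\sqrt{p_f\beta_{jj}}\,\mathbf{h}_{jj}\mathbf{a}_j\right] = \left(p_f\beta_{jj}\,\frac{p_r\tau\beta_{jj}}{1 + p_r\tau\sum_{i=1}^{L}\beta_{ij}}\right)^{1/2}\mathbb{E}[\theta]$$

and variance

$$\mathrm{var}\left\{\sqrt{p_f\beta_{jj}}\,\mathbf{h}_{jj}\mathbf{a}_j\right\} = p_f\beta_{jj}\left(\frac{p_r\tau\beta_{jj}}{1 + p_r\tau\sum_{i=1}^{L}\beta_{ij}}\,\mathrm{var}\{\theta\} + \frac{1 + p_r\tau\sum_{i\neq j}\beta_{ij}}{1 + p_r\tau\sum_{i=1}^{L}\beta_{ij}}\right).$$

Proof: The proof follows from (13) and (14). Note that $\mathrm{var}\{\theta\} = \mathbb{E}[\theta^2] - (\mathbb{E}[\theta])^2$ by definition.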

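Lemma 3: For both signal and interference terms in (10), the first and second order moments are as follows:

$$\mathbb{E}\left[\sqrt{p_f\beta_{jl}}\,\mathbf{h}_{jl}\mathbf{a}_l q_l\right] = 0,$$

$$\mathbb{E}\left[\left|\sqrt{p_f\beta_{jl}}\,\mathbf{h}_{jl}\mathbf{a}_l q_l\right|^2\right] = p_f\beta_{jl}\left(\frac{p_r\tau\beta_{jl}}{1 + p_r\tau\sum_{i=1}^{L}\beta_{il}}\,\mathbb{E}[\theta^2] + \frac{1 + p_r\tau\sum_{i\neq j}\beta_{il}}{1 + p_r\tau\sum_{i=1}^{L}\beta_{il}}\right).$$

Proof: Since $\mathbb{E}[q_l] = 0$ and $q_l$ is independent of $\mathbf{h}_{jl}$ and $\mathbf{a}_l$, it is clear that

$$\mathbb{E}\left[\sqrt{p_f\beta_{jl}}\,\mathbf{h}_{jl}\mathbf{a}_l q_l\right] = 0.$$

The proof of the second order moment follows directly from (14).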

The main result of this section is given in the next theorem. This theorem provides a closed-form expression for the achievable rates under the setting considered in this section, i.e., one user per cell ($K = 1$), same training sequence used by all users ($\boldsymbol{\psi}_{j1} = \boldsymbol{\psi}, \forall j$) and matched-filter (MF) precoding.

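Theorem 4: For the setting considered, the achievable rate of the user in the $j$-th cell during downlink transmission in (9) is given by

$$C\left(\frac{p_f\beta_{jj}\,\dfrac{p_r\tau\beta_{jj}}{\kappa_j}\,\mathbb{E}^2[\theta]}{\sum_{l\neq j} p_f\beta_{jl}\,\dfrac{p_r\tau\beta_{jl}}{\kappa_l}\,\mathbb{E}[\theta^2] + \zeta}\right), \tag{15}$$

where

$$\zeta = 1 + p_f\beta_{jj}\,\frac{p_r\tau\beta_{jj}}{\kappa_j}\,\mathrm{var}\{\theta\} + \sum_{l=1}^{L} p_f\beta_{jl}\,\frac{1 + p_r\tau\sum_{i\neq j}\beta_{il}}{\kappa_l},$$

$\kappa_j = 1 + p_r\tau\sum_{i=1}^{L}\beta_{ij}$, $\mathbb{E}[\theta] = \frac{\Gamma(M + \frac{1}{2})}{\Gamma(M)}$, $\mathbb{E}[\theta^2] = M$ and $\mathrm{var}\{\theta\} = M - \mathbb{E}^2[\theta]$. Here, $\Gamma(\cdot)$ is the Gamma function. For large $M$, the following limiting expression for the achievable rate can be obtained:

$$\lim_{M\to\infty} R_j = C\left(\frac{\dfrac{\beta_{jj}^2}{1 + p_r\tau\sum_{i=1}^{L}\beta_{ij}}}{\sum_{l\neq j}\dfrac{\beta_{jl}^2}{1 + p_r\tau\sum_{i=1}^{L}\beta_{il}}}\right). \tag{16}$$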

Proof: The proof of (15) follows by substituting the results of Lemma 2 and Lemma 3 in (9). Since $\theta$ has a scaled (by a factor of $1/\sqrt{2}$) chi distribution with $2M$ degrees of freedom, it is straightforward to see that $\mathbb{E}[\theta] = \frac{\Gamma(M + \frac{1}{2})}{\Gamma(M)}$, $\mathbb{E}[\theta^2] = M$ and $\mathrm{var}\{\theta\} = M - \mathbb{E}^2[\theta]$.

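Using the duplication formula

$$\Gamma(z)\,\Gamma\left(z + \frac{1}{2}\right) = 2^{(1-2z)}\sqrt{\pi}\,\Gamma(2z)$$

and Stirling's formula

$$\lim_{n\to\infty}\frac{n!}{\sqrt{2\pi n}\,n^n e^{-n}} = 1,$$

we obtain

$$\begin{aligned}
\lim_{M\to\infty}\frac{1}{\sqrt{M}}\frac{\Gamma\left(M + \frac{1}{2}\right)}{\Gamma(M)}
&= \lim_{M\to\infty}\sqrt{\frac{\pi}{M}}\,2^{(1-2M)}\frac{(2M-1)!}{(M-1)!\,(M-1)!},\\
&= \lim_{M\to\infty}\sqrt{\frac{\pi}{M}}\,2^{(1-2M)}\frac{\sqrt{2\pi(2M-1)}\,(2M-1)^{(2M-1)}}{2\pi(M-1)\,(M-1)^{2(M-1)}\,e},\\
&= \lim_{M\to\infty}\sqrt{\frac{2M-1}{2M}}\left(1 + \frac{1}{2(M-1)}\right)^{2M-1}e^{-1},\\
&= 1.
\end{aligned}$$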

Therefore, $\lim_{M\to\infty}\frac{\mathbb{E}^2[\theta]}{M} = 1$ and $\lim_{M\to\infty}\frac{\mathrm{var}\{\theta\}}{M} = 0$. This completes the proof of (16).

For large $M$, the value of $\mathrm{var}\{\theta\}$ ($\approx 1/4$) is insignificant compared to $M$. The results of the above theorem show that the performance does saturate with $M$. Typically, the reverse link is interference-limited, i.e., $p_r\tau\sum_{i=1}^{L}\beta_{il} \gg 1, \forall l$. The term $\sum_{i=1}^{L}\beta_{il}$ is the expected sum of squares of the propagation coefficients between the base station in the $l$-th cell and all users. Therefore, $\sum_{i=1}^{L}\beta_{il}$ is generally constant with respect to $l$. Using these approximations in (16), we get

$$R_j \approx C\left(\frac{\beta_{jj}^2}{\sum_{l\neq j}\beta_{jl}^2}\right).$$

This clearly shows that the impact of pilot contamination can be very significant if cross gains (between cells) are of the same order as direct gains (within the same cell). It suggests frequency/time reuse and pilot reuse techniques to reduce the cross gains (in the same frequency/time) relative to the direct gains. The benefits of frequency reuse in the limit of an infinite number of antennas were demonstrated in [14].

Remark 2: Our result in Theorem 4 is not an asymptotic result. The expression in (15) is exact for any value of the number of antennas $M$ at the base stations. Hence, this expression can be used to find the appropriate frequency/time reuse scheme for any given value of $M$ and other system parameters. We do not focus on this in this paper, as this would depend largely on the actual system parameters including the cell layout and the shadowing model.
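To make the use of (15) for finite $M$ concrete, the following is a minimal sketch that evaluates (15) and the limit (16). It is an illustration under stated assumptions rather than the authors' code: it takes $C(x) = \log_2(1 + x)$, stores $\beta_{jl}$ as `beta[j][l]` with 0-based cell indices, and the function names and the two-cell parameter values in the usage lines are hypothetical.

```python
import math


def theta_moments(M):
    """E[theta], E[theta^2], var{theta} for theta = ||u||, u ~ CN(0, I_M)."""
    mean = math.exp(math.lgamma(M + 0.5) - math.lgamma(M))  # Gamma(M+1/2)/Gamma(M)
    return mean, float(M), M - mean * mean


def kappas(beta, p_r, tau):
    """kappa_l = 1 + p_r tau sum_i beta_{il}, with beta[i][l] = beta_{il}."""
    L = len(beta)
    return [1.0 + p_r * tau * sum(beta[i][l] for i in range(L)) for l in range(L)]


def rate_finite_M(j, beta, p_f, p_r, tau, M):
    """Evaluate (15) for the user in cell j, assuming C(x) = log2(1 + x)."""
    L = len(beta)
    kappa = kappas(beta, p_r, tau)
    e1, e2, var = theta_moments(M)

    def est_gain(l):  # p_r tau beta_{jl} / kappa_l
        return p_r * tau * beta[j][l] / kappa[l]

    def err_gain(l):  # (1 + p_r tau sum_{i != j} beta_{il}) / kappa_l
        others = sum(beta[i][l] for i in range(L) if i != j)
        return (1.0 + p_r * tau * others) / kappa[l]

    signal = p_f * beta[j][j] * est_gain(j) * e1 ** 2
    interference = sum(
        p_f * beta[j][l] * est_gain(l) * e2 for l in range(L) if l != j
    )
    zeta = (
        1.0
        + p_f * beta[j][j] * est_gain(j) * var
        + sum(p_f * beta[j][l] * err_gain(l) for l in range(L))
    )
    return math.log2(1.0 + signal / (interference + zeta))


def rate_limit(j, beta, p_r, tau):
    """Evaluate the large-M limit (16), assuming C(x) = log2(1 + x)."""
    kappa = kappas(beta, p_r, tau)
    num = beta[j][j] ** 2 / kappa[j]
    den = sum(beta[j][l] ** 2 / kappa[l] for l in range(len(beta)) if l != j)
    return math.log2(1.0 + num / den)


# Hypothetical two-cell example: direct gain 1, cross gain 0.8,
# p_f = 20 dB and p_r = 10 dB taken as linear values 100 and 10.
beta = [[1.0, 0.8], [0.8, 1.0]]
for M in (8, 64, 512):
    print(M, rate_finite_M(0, beta, p_f=100.0, p_r=10.0, tau=4, M=M))
print("limit", rate_limit(0, beta, p_r=10.0, tau=4))
```

In this example the finite-$M$ rate increases with $M$ but stays below the limit (16), which is the saturation effect discussed above.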

Remark 3: The result in Theorem 4 is for the setting with one user per cell. In the general setting with $K$ users per cell, a similar analysis can be performed; however, it need not simplify to a simple closed-form expression. The achievable rate in (9) can be numerically evaluated in the general setting, and this can be used to numerically study the impact of pilot contamination.


To summarize, the impact of uplink training with non-orthogonal pilots can be serious when the cross-gains are not small compared to the direct gains. This pilot contamination problem is often neglected in theory and even in many large-scale simulations. The analysis in this section shows the need to account for this impact, especially in systems with high reuse of training sequences. In addition to uplink training in TDD systems, which is the focus of this paper, the pilot contamination problem would appear in other scenarios as well, as it is fundamental to training with non-orthogonal pilots.

Next, we proceed to develop a new precoding method referred to as the multi-cell MMSE-based precoding.

V. MULTI-CELL MMSE-BASED PRECODING

In the previous section, we show that pilot contamination severely impacts the system performance by increasing the inter-cell interference. In particular, we show that the inter-cell interference grows like the intended signal with the number of antennas $M$ at the base stations while using matched-filter (MF) precoding. Therefore, in the presence of pilot contamination, in addition to frequency/time/pilot reuse schemes, it is crucial to account for inter-cell interference while designing a precoding method. Furthermore, since pilot contamination originates from the non-orthogonal training sequences, it is important to account for the training sequence allocation while designing a precoding method. The approach of accounting for inter-cell interference while designing a precoding method is common, while the approach of accounting for the training sequence allocation is not. Again, the usual approach is to decouple the channel estimation and precoding completely. However, while using non-orthogonal pilots, this is not the right approach. These observations follow from our pilot contamination analysis in the previous section.

The precoding problem cannot be directly formulated as a joint optimization problem as different base stations have different received training signals. In other words, the problem is decentralized in nature. Therefore, one approach is to apply single-cell precoding methods. For example, since we assume orthogonal training sequences in every cell, we can perform zero-forcing on the users in every cell. The precoding matrix corresponding to this zero-forcing approach is given by

$$\mathbf{A}_l = \frac{\hat{\mathbf{G}}_{ll}^\dagger\left(\hat{\mathbf{G}}_{ll}\hat{\mathbf{G}}_{ll}^\dagger\right)^{-1}}{\sqrt{\mathrm{tr}\left[\left(\hat{\mathbf{G}}_{ll}\hat{\mathbf{G}}_{ll}^\dagger\right)^{-1}\right]}}, \tag{17}$$

where $\hat{\mathbf{G}}_{ll} = \sqrt{p_f}\,\mathbf{D}_{ll}^{\frac{1}{2}}\hat{\mathbf{H}}_{ll}$. However, this zero-forcing precoding or other single-cell precoding methods do not account for the training sequence allocation, which is potentially the right approach to mitigate the pilot contamination problem. We explore this next.

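In order to determine the precoding matrices, we formulate an optimization problem for each precoding matrix. Consider the $j$-th cell. The signal received by the users in this cell given by (4) is a function of all the precoding matrices (used at all the base stations). Therefore, the MMSE-based precoding methods for the single-cell setting considered in [31] do not extend (directly) to this setting. Let us consider the signal and interference terms corresponding to the base station in the $l$-th cell. Based on these terms, we formulate the following optimization problem to obtain the precoding matrix $\mathbf{A}_l$. We use the following notation: $\mathbf{F}_{jl} = \sqrt{p_f}\,\mathbf{D}_{jl}^{\frac{1}{2}}\mathbf{H}_{jl}$, $\hat{\mathbf{F}}_{jl} = \sqrt{p_f}\,\mathbf{D}_{jl}^{\frac{1}{2}}\hat{\mathbf{H}}_{jl}$ and $\tilde{\mathbf{F}}_{jl} = \mathbf{F}_{jl} - \hat{\mathbf{F}}_{jl}$ for all $j$ and $l$. The optimization problem is:

$$\min_{\mathbf{A}_l,\alpha_l}\ \mathbb{E}_{\tilde{\mathbf{F}}_{jl},\mathbf{z}_l,\mathbf{q}_l}\left[\left\|\alpha_l\left(\mathbf{F}_{ll}\mathbf{A}_l\mathbf{q}_l + \mathbf{z}_l\right) - \mathbf{q}_l\right\|^2 + \sum_{j\neq l}\left\|\alpha_l\gamma\left(\mathbf{F}_{jl}\mathbf{A}_l\mathbf{q}_l\right)\right\|^2 \,\middle|\, \hat{\mathbf{F}}_{jl}\right] \tag{18}$$

subject to $\mathrm{tr}\{\mathbf{A}_l^\dagger\mathbf{A}_l\} = 1$.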

This objective function is very intuitive. The objective function of the problem (18) consists of two parts: (i) the sum of squares of “errors” seen by the users in the $l$-th cell, and (ii) the sum of squares of interference seen by the users in all other cells. The parameter $\gamma$ of the optimization problem “controls” the relative weights associated with these two parts. The real scalar parameter $\alpha_l$ is important as it “virtually” corresponds to the potential scaling that can be performed at the users. The optimal solution to the problem (18), denoted by $\mathbf{A}_l^{opt}$, is the multi-cell MMSE-based precoding matrix.

Next, we obtain a closed-form expression for $\mathbf{A}_l^{opt}$. The following lemma is required later for obtaining the optimal solution to the problem (18).

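Lemma 5: Consider the optimization problem (18). For all $j$ and $l$,

$$\mathbb{E}\left[\tilde{\mathbf{F}}_{jl}^\dagger\tilde{\mathbf{F}}_{jl}\,\middle|\,\hat{\mathbf{F}}_{jl}\right] = \delta_{jl}\mathbf{I}_M,$$

where

$$\delta_{jl} = p_f\,\mathrm{tr}\left\{\mathbf{D}_{jl}\left(\mathbf{I}_K + p_r\tau\,\mathbf{D}_{jl}^{\frac{1}{2}}\boldsymbol{\Psi}_j^\dagger\boldsymbol{\Lambda}_{jl}\boldsymbol{\Psi}_j\mathbf{D}_{jl}^{\frac{1}{2}}\right)^{-1}\right\}, \tag{19}$$

and

$$\boldsymbol{\Lambda}_{jl} = \left(\mathbf{I} + p_r\tau\sum_{i\neq j}\boldsymbol{\Psi}_i\mathbf{D}_{il}\boldsymbol{\Psi}_i^\dagger\right)^{-1}.$$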
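As a sanity check that is not part of the original derivation, (19) can be specialized to the setting of Section IV, i.e., $K = 1$ and a common unit-norm pilot $\boldsymbol{\psi}$, so that $\mathbf{D}_{jl} = \beta_{jl}$ and $\boldsymbol{\Psi}_j = \boldsymbol{\psi}$ for all $j$. The sketch below assumes only the matrix inversion lemma and $\boldsymbol{\psi}^\dagger\boldsymbol{\psi} = 1$:

```latex
\begin{aligned}
\boldsymbol{\psi}^\dagger\boldsymbol{\Lambda}_{jl}\boldsymbol{\psi}
  &= \boldsymbol{\psi}^\dagger\Big(\mathbf{I}
     + p_r\tau\sum_{i\neq j}\beta_{il}\,\boldsymbol{\psi}\boldsymbol{\psi}^\dagger\Big)^{-1}\boldsymbol{\psi}
   = \frac{1}{1 + p_r\tau\sum_{i\neq j}\beta_{il}},\\
\delta_{jl}
  &= \frac{p_f\beta_{jl}}{1 + \dfrac{p_r\tau\beta_{jl}}{1 + p_r\tau\sum_{i\neq j}\beta_{il}}}
   = p_f\beta_{jl}\,\frac{1 + p_r\tau\sum_{i\neq j}\beta_{il}}{1 + p_r\tau\sum_{i=1}^{L}\beta_{il}}.
\end{aligned}
```

In this special case $\delta_{jl}$ equals $p_f\beta_{jl}$ times the estimation-error variance of Section IV, which is the second term in the second-order moment of Lemma 3.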

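Proof: Let $\tilde{\mathbf{f}}_{jlm}$ denote the $m$-th column of $\tilde{\mathbf{F}}_{jl}$. Similarly, we define $\mathbf{h}_{jlm}$ and $\hat{\mathbf{h}}_{jlm}$. From (3), we have

$$\begin{aligned}
\tilde{\mathbf{f}}_{jlm} &= \sqrt{p_f}\,\mathbf{D}_{jl}^{\frac{1}{2}}\left(\mathbf{h}_{jlm} - \hat{\mathbf{h}}_{jlm}\right),\\
&= \sqrt{p_f}\,\mathbf{D}_{jl}^{\frac{1}{2}}\left(\mathbf{h}_{jlm} - \sqrt{p_r\tau}\,\mathbf{D}_{jl}^{\frac{1}{2}}\boldsymbol{\Psi}_j^\dagger\boldsymbol{\Delta}_l\mathbf{y}_{lm}\right),
\end{aligned}$$

where

$$\boldsymbol{\Delta}_l = \left(\mathbf{I} + p_r\tau\sum_{i=1}^{L}\boldsymbol{\Psi}_i\mathbf{D}_{il}\boldsymbol{\Psi}_i^\dagger\right)^{-1},$$

and $\mathbf{y}_{lm}$ is given by (1). For given $j$ and $l$, it is clear that $\{\tilde{\mathbf{f}}_{jlm}\}_{m=1}^{M}$ is i.i.d. zero-mean $\mathcal{CN}$ distributed. Hence,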


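$\mathbb{E}\left[\tilde{\mathbf{F}}_{jl}^\dagger\tilde{\mathbf{F}}_{jl}\,\middle|\,\hat{\mathbf{F}}_{jl}\right] = \delta_{jl}\mathbf{I}_M$ where

$$\begin{aligned}
\delta_{jl} &= \mathbb{E}\left[\tilde{\mathbf{f}}_{jlm}^\dagger\tilde{\mathbf{f}}_{jlm}\,\middle|\,\hat{\mathbf{F}}_{jl}\right],\\
&= p_f\,\mathrm{tr}\left\{\mathbf{D}_{jl}^{\frac{1}{2}}\left(\mathbf{I}_K - \mathbb{E}\left[\hat{\mathbf{h}}_{jlm}\hat{\mathbf{h}}_{jlm}^\dagger\,\middle|\,\hat{\mathbf{F}}_{jl}\right]\right)\mathbf{D}_{jl}^{\frac{1}{2}}\right\},\\
&= p_f\,\mathrm{tr}\left\{\mathbf{D}_{jl}^{\frac{1}{2}}\left(\mathbf{I}_K - p_r\tau\,\mathbf{D}_{jl}^{\frac{1}{2}}\boldsymbol{\Psi}_j^\dagger\boldsymbol{\Delta}_l\boldsymbol{\Psi}_j\mathbf{D}_{jl}^{\frac{1}{2}}\right)\mathbf{D}_{jl}^{\frac{1}{2}}\right\},\\
&= p_f\,\mathrm{tr}\left\{\mathbf{D}_{jl}^{\frac{1}{2}}\left(\mathbf{I}_K + p_r\tau\,\mathbf{D}_{jl}^{\frac{1}{2}}\boldsymbol{\Psi}_j^\dagger\boldsymbol{\Lambda}_{jl}\boldsymbol{\Psi}_j\mathbf{D}_{jl}^{\frac{1}{2}}\right)^{-1}\mathbf{D}_{jl}^{\frac{1}{2}}\right\}.
\end{aligned}$$

The last step follows from the matrix inversion lemma.

The main result of this section is given by the following theorem. This theorem provides a closed-form expression for the multi-cell MMSE-based precoding matrix.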

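Theorem 6: The optimal solution to the problem (18) is

$$\mathbf{A}_l^{opt} = \frac{1}{\alpha_l^{opt}}\left(\hat{\mathbf{F}}_{ll}^\dagger\hat{\mathbf{F}}_{ll} + \gamma^2\sum_{j\neq l}\hat{\mathbf{F}}_{jl}^\dagger\hat{\mathbf{F}}_{jl} + \eta\,\mathbf{I}_M\right)^{-1}\hat{\mathbf{F}}_{ll}^\dagger, \tag{20}$$

where

$$\eta = \delta_{ll} + \gamma^2\sum_{j\neq l}\delta_{jl} + K,$$

$\delta_{jl}$ is given by (19) and $\alpha_l^{opt}$ satisfies $\mathrm{tr}\{(\mathbf{A}_l^{opt})^\dagger\mathbf{A}_l^{opt}\} = 1$.

Proof: First, we simplify the objective function $J(\mathbf{A}_l, \alpha_l)$ of the problem (18) as follows: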

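$$\begin{aligned}
J(\mathbf{A}_l,\alpha_l)
&= \mathbb{E}\left[\left\|\alpha_l\left(\mathbf{F}_{ll}\mathbf{A}_l\mathbf{q}_l + \mathbf{z}_l\right) - \mathbf{q}_l\right\|^2 + \sum_{j\neq l}\left\|\alpha_l\gamma\mathbf{F}_{jl}\mathbf{A}_l\mathbf{q}_l\right\|^2 \,\middle|\, \hat{\mathbf{F}}_{jl}\right],\\
&= \mathbb{E}\left[\left\|\left(\alpha_l\mathbf{F}_{ll}\mathbf{A}_l - \mathbf{I}_K\right)\mathbf{q}_l\right\|^2 + \sum_{j\neq l}\left\|\alpha_l\gamma\mathbf{F}_{jl}\mathbf{A}_l\mathbf{q}_l\right\|^2 \,\middle|\, \hat{\mathbf{F}}_{jl}\right] + \alpha_l^2 K,\\
&= \mathrm{tr}\left\{\mathbb{E}\left[\left(\alpha_l\mathbf{F}_{ll}\mathbf{A}_l - \mathbf{I}_K\right)^\dagger\left(\alpha_l\mathbf{F}_{ll}\mathbf{A}_l - \mathbf{I}_K\right) + \sum_{j\neq l}\alpha_l^2\gamma^2\mathbf{A}_l^\dagger\mathbf{F}_{jl}^\dagger\mathbf{F}_{jl}\mathbf{A}_l \,\middle|\, \hat{\mathbf{F}}_{jl}\right]\right\} + \alpha_l^2 K,\\
&= \mathrm{tr}\left\{\alpha_l^2\mathbf{A}_l^\dagger\,\mathbb{E}\left[\mathbf{F}_{ll}^\dagger\mathbf{F}_{ll}\,\middle|\,\hat{\mathbf{F}}_{jl}\right]\mathbf{A}_l - \alpha_l\mathbf{A}_l^\dagger\hat{\mathbf{F}}_{ll}^\dagger - \alpha_l\hat{\mathbf{F}}_{ll}\mathbf{A}_l + \sum_{j\neq l}\alpha_l^2\gamma^2\mathbf{A}_l^\dagger\,\mathbb{E}\left[\mathbf{F}_{jl}^\dagger\mathbf{F}_{jl}\,\middle|\,\hat{\mathbf{F}}_{jl}\right]\mathbf{A}_l\right\} + \left(\alpha_l^2 + 1\right)K,\\
&= \mathrm{tr}\left\{\alpha_l^2\mathbf{A}_l^\dagger\left(\hat{\mathbf{F}}_{ll}^\dagger\hat{\mathbf{F}}_{ll} + \gamma^2\sum_{j\neq l}\hat{\mathbf{F}}_{jl}^\dagger\hat{\mathbf{F}}_{jl} + \Big(\delta_{ll} + \gamma^2\sum_{j\neq l}\delta_{jl}\Big)\mathbf{I}_M\right)\mathbf{A}_l - \alpha_l\mathbf{A}_l^\dagger\hat{\mathbf{F}}_{ll}^\dagger - \alpha_l\hat{\mathbf{F}}_{ll}\mathbf{A}_l\right\} + \left(\alpha_l^2 + 1\right)K.
\end{aligned}$$

The last step follows from Lemma 5.

Now, consider the Lagrangian formulation

$$L(\mathbf{A}_l,\alpha_l,\lambda) = J(\mathbf{A}_l,\alpha_l) + \lambda\left(\mathrm{tr}\left\{\mathbf{A}_l^\dagger\mathbf{A}_l\right\} - 1\right)$$

for the problem (18). Let

$$\mathbf{R} = \hat{\mathbf{F}}_{ll}^\dagger\hat{\mathbf{F}}_{ll} + \gamma^2\sum_{j\neq l}\hat{\mathbf{F}}_{jl}^\dagger\hat{\mathbf{F}}_{jl} + \left(\delta_{ll} + \gamma^2\sum_{j\neq l}\delta_{jl} + \frac{\lambda}{\alpha_l^2}\right)\mathbf{I}_M,$$

$\mathbf{U} = \alpha_l\mathbf{R}^{\frac{1}{2}}\mathbf{A}_l$ and $\mathbf{V} = \mathbf{R}^{-\frac{1}{2}}\hat{\mathbf{F}}_{ll}^\dagger$. We have

$$L(\mathbf{A}_l,\alpha_l,\lambda) = \|\mathbf{U} - \mathbf{V}\|^2 - \mathrm{tr}\left\{\hat{\mathbf{F}}_{ll}\mathbf{R}^{-1}\hat{\mathbf{F}}_{ll}^\dagger\right\} + \left(\alpha_l^2 + 1\right)K - \lambda. \tag{21}$$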

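This can be easily verified by expanding the right hand side. It is clear from (21) that, for any given $\alpha_l$ and $\lambda$, $L(\mathbf{A}_l,\alpha_l,\lambda)$ is minimized if and only if $\mathbf{U} = \mathbf{V}$. Hence, we obtain

$$\mathbf{A}_l^{opt} = \frac{1}{\alpha_l}\mathbf{R}^{-1}\hat{\mathbf{F}}_{ll}^\dagger. \tag{22}$$

Let $L(\alpha_l,\lambda) = L(\mathbf{A}_l^{opt},\alpha_l,\lambda)$. Now, we have

$$L(\alpha_l,\lambda) = -\mathrm{tr}\left\{\hat{\mathbf{F}}_{ll}\mathbf{R}^{-1}\hat{\mathbf{F}}_{ll}^\dagger\right\} + \left(\alpha_l^2 + 1\right)K - \lambda. \tag{23}$$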

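Note that $\hat{\mathbf{F}}_{ll}^\dagger\hat{\mathbf{F}}_{ll} + \gamma^2\sum_{j\neq l}\hat{\mathbf{F}}_{jl}^\dagger\hat{\mathbf{F}}_{jl}$ can be factorized in the form $\mathbf{S}^\dagger\,\mathrm{diag}\{[c_1\ c_2\ \ldots\ c_M]\}\,\mathbf{S}$ where $\mathbf{S}^\dagger\mathbf{S} = \mathbf{I}_M$. Let $\delta = \delta_{ll} + \gamma^2\sum_{j\neq l}\delta_{jl}$. Therefore,

$$\begin{aligned}
\mathbf{R}^{-1} &= \left(\mathbf{S}^\dagger\,\mathrm{diag}\{[c_1\ c_2\ \ldots\ c_M]\}\,\mathbf{S} + \left(\delta + \frac{\lambda}{\alpha_l^2}\right)\mathbf{I}_M\right)^{-1},\\
&= \left(\mathbf{S}^\dagger\,\mathrm{diag}\left\{\left[c_1 + \delta + \frac{\lambda}{\alpha_l^2}\ \ \ldots\ \ c_M + \delta + \frac{\lambda}{\alpha_l^2}\right]\right\}\mathbf{S}\right)^{-1},\\
&= \mathbf{S}^\dagger\,\mathrm{diag}\left\{\left[\left(c_1 + \delta + \frac{\lambda}{\alpha_l^2}\right)^{-1}\ \ldots\ \left(c_M + \delta + \frac{\lambda}{\alpha_l^2}\right)^{-1}\right]\right\}\mathbf{S}.
\end{aligned} \tag{24}$$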

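Substituting (24) in (23), we get

$$L(\alpha_l,\lambda) = -\sum_{m=1}^{M}\frac{d_m}{c_m + \delta + \frac{\lambda}{\alpha_l^2}} + \left(\alpha_l^2 + 1\right)K - \lambda, \tag{25}$$

where $d_m$ is the $(m,m)$-th entry of $\mathbf{S}\hat{\mathbf{F}}_{ll}^\dagger\hat{\mathbf{F}}_{ll}\mathbf{S}^\dagger$. Consider the equations obtained by differentiating (25) w.r.t. $\lambda$ and $\alpha_l$, respectively, and equating to zero:

$$\sum_{m=1}^{M}\frac{d_m}{\left(c_m + \delta + \frac{\lambda}{\alpha_l^2}\right)^2}\,\frac{1}{\alpha_l^2} = 1, \tag{26}$$

$$-\sum_{m=1}^{M}\frac{d_m}{\left(c_m + \delta + \frac{\lambda}{\alpha_l^2}\right)^2}\,\frac{2\lambda}{\alpha_l^3} + 2\alpha_l K = 0. \tag{27}$$

Substituting (26) in (27), we get

$$\frac{\lambda}{\alpha_l^2} = K. \tag{28}$$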
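The substitution that leads to (28) can be spelled out as follows. This is a short worked step using only (26) and (27); the shorthand $s$ is introduced here purely for illustration, and $\alpha_l \neq 0$ is assumed.

```latex
s \triangleq \sum_{m=1}^{M}\frac{d_m}{\left(c_m+\delta+\lambda/\alpha_l^2\right)^2},
\qquad
(26)\;\Rightarrow\; s = \alpha_l^2,
\qquad
(27)\;\Rightarrow\; -\frac{2\lambda}{\alpha_l^3}\,s + 2\alpha_l K = 0
\;\Rightarrow\; -\frac{2\lambda}{\alpha_l} + 2\alpha_l K = 0
\;\Rightarrow\; \frac{\lambda}{\alpha_l^2} = K .
```

With $\lambda/\alpha_l^2 = K$, the identity term in $\mathbf{R}$ becomes $\left(\delta_{ll} + \gamma^2\sum_{j\neq l}\delta_{jl} + K\right)\mathbf{I}_M$, which is the regularization $\eta\,\mathbf{I}_M$ appearing in (20).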

Combining the results in (22), (24), and (28), we get (20), where $\alpha_l^{opt}$ is such that $\|\mathbf{A}_l^{opt}\|_F^2 = 1$.

The precoding described above is primarily suited for maximizing the minimum of the rates achieved by all the users. This is because all users are treated equally without differentiating them based on the channels. Therefore, when the performance metric of interest is sum rate, this precoding can be combined with power control, scheduling, and other similar techniques to enhance the net performance. Since our main concern is the inter-cell interference resulting from pilot contamination, and to avoid overly complicated systems, we do not use that possibility in this paper. In the next section, all numerical results and comparisons are performed without power control.


[Figure 3: sum rate (bits/symbol) versus number of antennas M; curves: Contamination, ZF; No Contamination, ZF.]

Fig. 3. Comparison of sum rate with and without pilot contamination for the two-cell system; sum rate saturates in the presence of contamination.

[Figure 4: sum rate (bits/symbol) versus number of antennas M; curves: Zero-Forcing, GPS, Multi-Cell MMSE.]

Fig. 4. Comparison of different schemes for the two-cell system; multi-cell MMSE performs inter-cell interference mitigation leading to improved sum rate.

VI. NUMERICAL RESULTS

Multi-Cell MMSE precoding denotes the new precoding method given in (20) with parameter $\gamma$ set to unity.⁶ ZF precoding denotes the popular zero-forcing precoding given in (17). GPS denotes the single-cell precoding method suggested in [31], which is a special case of the precoding given in (20) with parameter $\gamma$ set to zero. In all the plots, we average the performance metric over $10^3$ i.i.d. channel realizations.

A. Two-cell System

We consider a basic two-cell example to understand the impact of pilot contamination on the total system throughput (sum rate). In particular, we consider a two-cell system with $p_f = 20$ dB, $p_r = 10$ dB and $K = 4$ users in both cells. We set all the direct gains to 1 and all cross gains to $a$, i.e., for all $k$, $\beta_{jlk} = 1$ if $j = l$ and $\beta_{jlk} = a$ if $j \neq l$. We capture the main observations using two plots.

First, in Figure 3, we use a cross gain value of $a = 0.8$ and plot sum rate versus number of antennas $M$ with zero-forcing for training lengths of $\tau = 4$ (scenario with pilot contamination) and $\tau = 8$ (scenario without contamination). With $\tau = 4$, the orthogonal training sequences used in the 1-st cell are reused in the 2-nd cell. With $\tau = 8$, all users in the system are given orthogonal pilots. In Figure 3, we can clearly observe the saturation of total throughput in the presence of pilot contamination. Note that both scenarios deal with interference. In many practical systems, we cannot necessarily keep $\tau$ large ($\tau = 8$ in this example) as the coherence interval is typically very short, which requires using small $\tau$.

⁶ In our simulations, we observe that the performance is not very sensitive to the value of this parameter.

[Figure 5: $R$ (bits/symbol) versus cross-gain $a$; curves: Zero-Forcing (ZF), $b = 0.1a$; Multi-Cell MMSE, $b = 0.1a$; Zero-Forcing (ZF), $b = 0.3a$; Multi-Cell MMSE, $b = 0.3a$.]

Fig. 5. Comparison of zero-forcing and multi-cell MMSE precoding methods; $a$ and $b$ correspond to different cross-gains and $R$ denotes the minimum rate achieved by all users.

In the presence of pilot contamination ($\tau = 4$), neither GPS nor multi-cell MMSE provides noticeable improvement in total throughput (except for small values of $M$). This is not surprising as pilot contamination is a very fundamental problem in this example.⁷ For $M = 4$ to $M = 12$, the improvement using GPS and multi-cell MMSE is reasonable. This improvement results from using MMSE instead of zero-forcing, and hence both GPS and multi-cell MMSE provide very similar performance.⁸

In the absence of pilot contamination ($\tau = 8$), the story is different as shown in Figure 4. Multi-cell MMSE outperforms both the single-cell schemes (GPS and zero-forcing) by a huge margin. This is possible as multi-cell MMSE is capable of performing efficient inter-cell interference mitigation. Note that this inter-cell interference mitigation is achieved using the channel estimates obtained in a distributed manner. This clearly shows the advantage of multi-cell MMSE. However, this example only focuses on the scenario without pilot contamination. Therefore, a natural question is whether multi-cell MMSE can provide throughput gains in a mixed scenario, which is addressed next.

B. Multi-cell System

We consider a multi-cell system with $L = 4$ cells, $M = 8$ antennas at all base stations, $K = 2$ users in every cell and training length of $\tau = 4$. We consider $p_f = 20$ dB and $p_r = 10$ dB. Orthogonal training sequences are collectively used within the 1-st and 2-nd cells. The training sequences used in the 1-st (2-nd) cell are reused in the 3-rd (4-th) cell. Thus, we model a scenario where training sequences are reused. We keep the propagation factors as follows: for all $k$, $\beta_{jlk} = 1$ if $j = l$, $\beta_{jlk} = a$ if $(j, l) \in \{(1, 2), (2, 1), (3, 4), (4, 3)\}$, and $\beta_{jlk} = b$ for all other values of $j$ and $l$. “Frequency reuse” is handled semi-quantitatively by adjusting the cross-gains.

⁷ We can still overcome pilot contamination using frequency/time reuse.

⁸ This plot is not provided as it is not the main focus of this paper.

[Figure 6: $R$ (bits/symbol) versus number of antennas $M$; curves: Multi-Cell MMSE, GPS.]

Fig. 6. Comparison of GPS and multi-cell MMSE precoding methods with $a = 0.8$ and $b = 0.1a$; $R$ denotes the minimum rate achieved by all users.

[Figure 7: $R$ (bits/symbol) versus number of antennas $M$; curves: GPS, Multi-Cell MMSE.]

Fig. 7. Comparison of GPS and multi-cell MMSE precoding methods with $a = 0.8$ and $b = 0.1a$; pilots in cells 1 and 3 (2 and 4) are not reused but rotated by 45 degrees.
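The propagation factors of this four-cell layout can be written down programmatically. The following is a minimal sketch for setting up such an experiment, not the authors' simulation code; the function names are hypothetical, and reading the dB values as $10\log_{10}$ of the linear powers is an assumption.

```python
def propagation_factors(a, b):
    """Return beta[(j, l)] for the four-cell layout (1-based cell indices).

    Direct gains are 1, the strongly coupled pairs (1,2) and (3,4) use
    cross gain a, and every other pair uses cross gain b.
    """
    strong_pairs = {(1, 2), (2, 1), (3, 4), (4, 3)}
    beta = {}
    for j in range(1, 5):
        for l in range(1, 5):
            if j == l:
                beta[(j, l)] = 1.0   # direct gain
            elif (j, l) in strong_pairs:
                beta[(j, l)] = a     # strong cross gain
            else:
                beta[(j, l)] = b     # weak cross gain
    return beta


def db_to_linear(x_db):
    """Convert a power value in dB to linear scale (assumed convention)."""
    return 10.0 ** (x_db / 10.0)


a = 0.8
beta = propagation_factors(a, 0.1 * a)
p_f, p_r = db_to_linear(20.0), db_to_linear(10.0)
print(beta[(1, 1)], beta[(1, 2)], beta[(1, 3)], p_f, p_r)
```

Note that in this layout the cells that share pilots (1 and 3, 2 and 4) are coupled through the weaker gain $b$, while the cells with mutually orthogonal pilots (1 and 2, 3 and 4) are coupled through $a$.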

Another performance metric of interest is the minimum rate achieved by all users, denoted by $R = \min_{jk} R_{jk}$. In Figure 5, we plot the performance of ZF and multi-cell MMSE precoding methods for different values of $a$ and $b$. We observe a significant advantage of using multi-cell MMSE precoding for a wide range of values of $a$ and $b$. In Figure 6, we plot the performance of GPS and multi-cell MMSE precoding methods as a function of the number of antennas $M$. We also consider the scenario when pilots in cells 1 and 3 (2 and 4) are not reused but rotated by 45 degrees. This comparison is given in Figure 7. In both cases, we observe significant advantages in using our multi-cell MMSE precoding. Thus, the multi-cell MMSE scheme is capable of handling different scenarios as it utilizes training sequences for precoder design to mitigate inter-cell interference even in the presence of pilot contamination. Note that in both cases the performance of ZF and GPS are almost indistinguishable, and hence we have omitted ZF in these plots.

[Figure 8: sum rate (bits/symbol) versus number of antennas $M$; curves: Zero-Forcing, GPS, Multi-Cell MMSE.]

Fig. 8. Comparison of different precoding methods with $a = 0.8$, $b = 0.1a$, $\tau = 8$; each cell has $K = 4$ users; sum rate corresponds to total sum throughput of all $L = 4$ cells.

Finally, we consider a system with reused pilots, $K = 4$ users in every cell and training length of $\tau = 8$. In Figure 8, we plot the total sum throughput of different precoding methods as a function of the number of antennas $M$. In summary, all the numerical results show that the new multi-cell MMSE precoding offers significant performance gains.

VII. CONCLUSION

In this paper, we characterize the impact of corrupted channel estimates caused by pilot contamination in TDD systems. When non-orthogonal training sequences are assigned to users, the precoding matrix used at a (multiple antenna) base station becomes correlated with the channel to users in other cells (referred to as pilot contamination). For a special setting that captures pilot contamination, we obtain closed-form expressions for the achievable rates. Using these analytical expressions, we show that, in the presence of pilot contamination, rates achieved by users saturate with the number of base station antennas. We conclude that appropriate frequency/time reuse techniques have to be employed to overcome this saturation effect. The fact that pilot contamination hasn’t surfaced in FDD studies suggests that researchers are assuming partial CSI with independently corrupted noise, and are not fully incorporating the impact of channel estimation.

Next, we develop a multi-cell MMSE-based precoding that depends on the set of training sequences assigned to the users. We obtain this precoding as the solution to an optimization problem whose objective function consists of two parts: (i) the mean-square error of signals received at the users in the same cell, and (ii) the mean-square interference caused at the users in other cells. We show that this precoding method reduces both intra-cell interference and inter-cell interference, and thus is similar in spirit to existing joint-cell precoding techniques. The primary differences between joint-cell precoding and our approach are that (a) our approach is distributed in nature and (b) we explicitly take into account the set of training sequences assigned to the users. Through numerical results, we show that our method outperforms popular single-cell precoding methods.

REFERENCES

[1] J. Jose, A. Ashikhmin, T. Marzetta, and S. Vishwanath, “Pilot contamination problem in multi-cell TDD systems,” in Proc. IEEE International Symposium on Information Theory, July 2009, pp. 2184–2188.

[2] E. Biglieri, R. Calderbank, A. Constantinides, A. Goldsmith, A. Paulraj, and H. V. Poor, MIMO Wireless Communications. Cambridge University Press, 2010.

[3] M. Sharif and B. Hassibi, “On the capacity of MIMO broadcast channels with partial side information,” IEEE Trans. Inf. Theory, vol. 51, pp. 506–522, Feb. 2005.

[4] P. Ding, D. J. Love, and M. D. Zoltowski, “On the sum rate of channel subspace feedback for multi-antenna broadcast channels,” in Proc. IEEE Global Telecommunications Conference, Nov. 2005, pp. 2699–2703.

[5] N. Jindal, “MIMO broadcast channels with finite-rate feedback,” IEEE Trans. Inf. Theory, vol. 52, pp. 5045–5060, Nov. 2006.

[6] T. Yoo, N. Jindal, and A. Goldsmith, “Multi-antenna broadcast channels with limited feedback and user selection,” IEEE J. Sel. Areas Commun., vol. 25, pp. 1478–1491, 2007.

[7] A. Ashikhmin and R. Gopalan, “Grassmannian packings for efficient quantization in MIMO broadcast systems,” in Proc. IEEE International Symposium on Information Theory, June 2007, pp. 1811–1815.

[8] K. Huang, R. W. Heath, Jr., and J. G. Andrews, “Space division multiple access with a sum feedback rate constraint,” IEEE Trans. Signal Process., vol. 55, no. 7, pp. 3879–3891, 2007.

[9] T. L. Marzetta, “How much training is required for multiuser MIMO?” in Proc. Asilomar Conference on Signals, Systems and Computers, Oct./Nov. 2006, pp. 359–363.

[10] J. Jose, A. Ashikhmin, P. Whiting, and S. Vishwanath, “Scheduling and pre-conditioning in multi-user MIMO TDD systems,” in Proc. IEEE International Conference on Communications, May 2008, pp. 4100–4105.

[11] B. Vojcic and W. M. Jang, “Transmitter precoding in synchronous multiuser communications,” IEEE Trans. Commun., vol. 46, no. 10, pp. 1346–1355, Oct. 1998.

[12] T. L. Marzetta and B. M. Hochwald, “Fast transfer of channel state information in wireless systems,” IEEE Trans. Signal Process., vol. 54, pp. 1268–1278, 2006.

[13] M. Guillaud, D. T. M. Slock, and R. Knopp, “A practical method for wireless channel reciprocity exploitation through relative calibration,” in Proc. ISSPA, 2005.

[14] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590–3600, 2010.

[15] G. J. Foschini, K. Karakayali, and R. A. Valenzuela, “Coordinating multiple antenna cellular networks to achieve enormous spectral efficiency,” IEE Proc. Commun., vol. 153, pp. 548–555, Aug. 2006.

[16] S. Venkatesan, A. Lozano, and R. Valenzuela, “Network MIMO: overcoming intercell interference in indoor wireless systems,” in Proc. Asilomar Conference on Signals, Systems and Computers, Nov. 2007, pp. 83–87.

[17] S. Shamai (Shitz), O. Somekh, and B. M. Zaidel, “Multi-cell communications: an information theoretic perspective,” in Joint Workshop on Communications and Coding, Oct. 2004.

[18] H. Zhang and H. Dai, “Co-channel interference mitigation and cooperative processing in downlink multicell multiuser MIMO networks,” European J. Wireless Commun. and Networking, pp. 222–235, 4th quarter 2004.

[19] J. Zhang, R. Chen, J. G. Andrews, A. Ghosh, and R. W. Heath, “Networked MIMO with clustered linear precoding,” IEEE Trans. Wireless Commun., vol. 8, no. 4, pp. 1910–1921, 2009.

[20] G. Caire and S. Shamai (Shitz), “On the achievable throughput of a multiantenna Gaussian broadcast channel,” IEEE Trans. Inf. Theory, vol. 49, pp. 1691–1707, July 2003.

[21] P. Viswanath and D. N. C. Tse, “Sum capacity of the vector Gaussian broadcast channel and uplink-downlink duality,” IEEE Trans. Inf. Theory, vol. 49, pp. 1912–1921, Aug. 2003.

[22] S. Vishwanath, N. Jindal, and A. J. Goldsmith, “Duality, achievable rates, and sum-rate capacity of Gaussian MIMO broadcast channels,” IEEE Trans. Inf. Theory, vol. 49, pp. 2658–2668, Oct. 2003.

[23] W. Yu and J. M. Cioffi, “Sum capacity of Gaussian vector broadcast channels,” IEEE Trans. Inf. Theory, vol. 50, no. 9, pp. 1875–1892, 2004.

[24] H. Weingarten, Y. Steinberg, and S. Shamai (Shitz), “The capacity region of the Gaussian multiple-input multiple-output broadcast channel,” IEEE Trans. Inf. Theory, vol. 52, pp. 3936–3964, Sep. 2006.

[25] F. Boccardi, F. Tosato, and G. Caire, “Precoding schemes for the MIMO-GBC,” in Int. Zurich Seminar on Communications, Feb. 2006.

[26] M. Airy, S. Bhadra, R. W. Heath, Jr., and S. Shakkottai, “Transmit precoding for the multiple antenna broadcast channel,” in Proc. Vehicular Technology Conference, vol. 3, 2006, pp. 1396–1400.

[27] B. M. Hochwald, C. B. Peel, and A. L. Swindlehurst, “A vector-perturbation technique for near-capacity multiantenna multiuser communication part II: perturbation,” IEEE Trans. Commun., vol. 53, pp. 537–544, Jan. 2005.

[28] M. Stojnic, H. Vikalo, and B. Hassibi, “Rate maximization in multi-antenna broadcast channels with linear preprocessing,” IEEE Trans. Wireless Commun., vol. 5, pp. 2338–2342, Sep. 2006.

[29] Z. Shen, R. Chen, J. G. Andrews, R. W. Heath, Jr., and B. L. Evans, “Low complexity user selection algorithms for multiuser MIMO systems with block diagonalization,” IEEE Trans. Signal Process., vol. 54, pp. 3658–3663, Sep. 2006.

[30] M. Joham, P. M. Castro, L. Castedo, and W. Utschick, “MMSE optimal feedback of correlated CSI for multi-user precoding,” in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, Mar. 2008, pp. 3129–3132.

[31] K. S. Gomadam, H. C. Papadopoulos, and C.-E. W. Sundberg, “Techniques for multi-user MIMO with two-way training,” in Proc. IEEE International Conference on Communications, May 2008, pp. 3360–3366.

[32] G. Caire, N. Jindal, M. Kobayashi, and N. Ravindran, “Multiuser MIMO achievable rates with downlink training and channel state feedback,” IEEE Trans. Inf. Theory, vol. 56, no. 6, pp. 2845–2866, June 2010.

[33] T. Kailath, A. H. Sayed, and B. Hassibi, Linear Estimation. Prentice Hall, 2000.

[34] B. Hassibi and B. M. Hochwald, “How much training is needed in multiple-antenna wireless links?” IEEE Trans. Inf. Theory, vol. 49, pp. 951–963, Apr. 2003.

Jubin Jose is currently a Ph.D. candidate in Electrical and Computer Engineering at The University of Texas at Austin (UT Austin). He received his B.Tech. degree in Electrical Engineering from the Indian Institute of Technology (IIT) Madras in 2006 and M.S. degree in Electrical and Computer Engineering from UT Austin in 2008. His industry experience includes positions at NEC labs, Princeton, NJ, Qualcomm Corporate R&D, San Diego, CA and Alcatel-Lucent Bell Labs, Murray Hill, NJ. His current research interests lie in the broad area of communication networks with emphasis on wireless communication, information theory and network optimization. He received the WNCG student leadership award in 2011.

Alexei Ashikhmin is a member of technical staff in the Communications and Statistical Sciences Department, Bell Laboratories, Murray Hill, New Jersey. He received his Ph.D. degree in Electrical Engineering from the Institute of Information Transmission Problems, Russian Academy of Science, Moscow, Russia, in 1994. In 1995 and 1996, he was with the Mathematics and Computer Science Department, Delft University of Technology, The Netherlands. From 1997 to 1999, he was a Postdoctoral Fellow at Modeling, Algorithms, and Informatics Group of Los Alamos National Laboratory. Since 1999, he has been with Bell Laboratories. In 2005-2007, Alexei Ashikhmin was an adjunct professor at Columbia University. He taught courses “Error Correcting Codes” and “Communications Theory.”

Alexei Ashikhmin’s research interests include communications theory, the theory of error correcting codes, and classical and quantum information theory. From 2003 to 2006, Dr. Ashikhmin served as an Associate Editor for IEEE Transactions on Information Theory, and is serving again, beginning this year. In 2002, Dr. Ashikhmin received Bell Laboratories President’s Gold Award for breakthrough research resulting in the ability to deliver unprecedented wireless bit-rates. In 2005, Dr. Ashikhmin was honored by the IEEE Communications S.O. Rice Prize Paper Award for work on LDPC codes for information transmission with multiple antennas. In 2010, Dr. Ashikhmin received Bell Labs Team Award for Crosstalk Cancellation in DSL systems. Dr. Ashikhmin is a senior member of IEEE Information Theory Society.

Thomas L. Marzetta was born in Washington, D.C. He received the PhD in electrical engineering from the Massachusetts Institute of Technology in 1978. His dissertation extended, to two dimensions, the three-way equivalence of autocorrelation sequences, minimum-phase prediction error filters, and reflection coefficient sequences. He worked for Schlumberger-Doll Research (1978 - 1987) to modernize geophysical signal processing for petroleum exploration. He headed a group at Nichols Research Corporation (1987 - 1995) which improved automatic target recognition, radar signal processing, and video motion detection. He joined Bell Laboratories in 1995 (formerly part of AT&T, then Lucent Technologies, now Alcatel-Lucent). He has had research supervisory responsibilities in communication theory, statistics, and signal processing. He specializes in multiple-antenna wireless, with a particular emphasis on the acquisition and exploitation of channel-state information. He is the originator of Large-Scale Antenna Systems which can provide huge improvements in wireless energy efficiency and spectral efficiency.

Dr. Marzetta was a member of the IEEE Signal Processing Society Technical Committee on Multidimensional Signal Processing, a member of the Sensor Array and Multichannel Technical Committee, an associate editor for the IEEE TRANSACTIONS ON SIGNAL PROCESSING, an associate editor for the IEEE TRANSACTIONS ON IMAGE PROCESSING, and a guest associate editor for the IEEE TRANSACTIONS ON INFORMATION THEORY Special Issue on Signal Processing Techniques for Space-Time Coded Transmissions (Oct. 2002) and for the IEEE TRANSACTIONS ON INFORMATION THEORY Special Issue on Space-Time Transmission, Reception, Coding, and Signal Design (Oct. 2003). Dr. Marzetta was the recipient of the 1981 ASSP Paper Award from the IEEE Signal Processing Society. He was elected a Fellow of the IEEE in Jan. 2003.

Sriram Vishwanath received the B.Tech. degree in electrical engineering from the Indian Institute of Technology (IIT), Madras, India, in 1998, the M.S. degree in electrical engineering from California Institute of Technology (Caltech), Pasadena, in 1999, and the Ph.D. degree in electrical engineering from Stanford University, Stanford, CA, in 2003. His industry experience includes positions at Lucent Bell Labs and National Semiconductor Corporation. He is currently an Associate Professor in the Department of Electrical and Computer Engineering, University of Texas, Austin. His research interests include network information theory, wireless systems, and mobile systems. Dr. Vishwanath received the NSF CAREER award in 2005 and the ARO Young Investigator Award in 2008. He is the co-recipient of the 2005 IEEE Joint Information Theory Society and Communications Society best paper award. He has served as the general chair of the IEEE Wireless Network Coding Conference (WiNC) in 2010, the local arrangements chair of ISIT 2010, and the Guest Editor-in-Chief of IEEE TRANSACTIONS ON INFORMATION THEORY Special Issue on Interference Networks.