
Hybrid Analog-Digital Beamforming for Massive MIMO Systems

Shahar Stein, Student IEEE and Yonina C. Eldar, Fellow IEEE

Abstract—In massive MIMO systems, hybrid beamforming is an essential technique for exploiting the potential array gain without using a dedicated RF chain for each antenna. In this work, we consider the data phase in a massive MIMO communication process, where the transmitter and receiver use fewer RF chains than antennas. We examine several different fully- and partially connected schemes and consider the design of hybrid beamformers that minimize the estimation error in the data. For the hybrid precoder, we introduce a framework for approximating the optimal fully-digital precoder with a feasible hybrid one. We exploit the fact that the fully-digital precoder is unique only up to a unitary matrix and optimize over this matrix and the hybrid precoder alternately. Our alternating minimization of approximation gap (Alt-MaG) framework improves the performance over state-of-the-art methods with no substantial increase in complexity. In addition, we present a special case of Alt-MaG, minimal gap iterative quantization (MaGiQ), that results in low complexity and lower mean squared error (MSE) than other common methods, in the case of very few RF chains. MaGiQ is shown to coincide with the optimal fully-digital solution in some scenarios. For combiner design, we exploit the structure of the MSE objective and develop a greedy ratio trace maximization technique that achieves low MSE under various settings. All of our algorithms can be used with multiple hardware architectures.

I. INTRODUCTION

Massive MIMO wireless systems have emerged as a leading candidate for 5G wireless access [1], [2]. Along with mmWave technologies, which were recently recognized as essential for coping with the spectrum crunch [3], they offer higher data rates and capacities than traditional MIMO systems. The use of large-scale antenna arrays at both the transmitter and receiver holds the potential for higher array gain than before. To utilize this gain, precoding and combining techniques are used. Traditionally implemented in the baseband (BB), these methods require dedicated RF hardware per antenna. Unfortunately, when taking a massive number of antennas into account, this results in a huge computational load and cost, as the RF components are expensive and have high power consumption, especially for mmWave technologies. Hence, it is desirable to design economical hardware that will utilize the potential gain from a large number of cheap antenna elements using a small number of expensive RF chains.

To achieve this goal, several hybrid analog-digital schemes have been suggested [4]–[13]. In hybrid precoding and combining, the operations are split between the digital and analog domains: at the transmitter side a low dimensional digital precoder operates on the transmitted signal at BB. An analog precoder then maps the small number of digital outputs to a large number of antennas, and the same is performed at the receiver side. Common analog architectures are based on analog phase shifters and switches. The specific schemes can vary according to the power and area budget. Two main families of architectures are the fully- and partially connected structures. Fully connected networks offer a mapping from each antenna to each RF chain and allow maximizing the precoding and combining gain. In the partially connected scheme, a reduced number of analog components is used. This degrades the achieved gain but offers lower power consumption and hardware complexity.

This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No. 646804-ERC-COG-BNYQ, and from the Israel Science Foundation under Grant no. 335/14.

In the data phase of each coherence interval of the MIMO communication process, the transmitter sends multiple data streams to the receiver over the constant and known channel, using precoding techniques. At the receiver side, a combiner is used to estimate the data vector from the received signal at the antennas. The precoder and combiner are chosen to optimize some desired performance measure, such as estimation error or spectral efficiency of the system. Unlike the fully-digital case, when considering a hybrid beamformer, the precoder and combiner matrices cannot have arbitrary entries, but are constrained according to the specific hardware choice. For example, when using a phase shifter network at the analog side, only unimodular matrices for the analog beamformer are considered. The goal then is to optimize the performance measure over all pairs of digital and analog precoder matrices, which yields a difficult non-convex optimization problem.

The majority of past works considered the fully-connected phase shifter network. It was suggested in [4] to separate the joint precoder and combiner design problem into two subproblems and solve each independently. Although this approach is sub-optimal, it greatly simplifies the difficult joint optimization problem. For the precoder, it was shown that minimizing the gap between the hybrid precoder and the optimal fully-digital one over all hybrid precoders approximately leads to the maximization of the system’s spectral efficiency. On the combiner side, minimizing the mean squared error (MSE) over all hybrid combiners was shown to be equivalent to minimizing the weighted approximation gap between the fully-digital combiner and the hybrid one, with the weights given by the received signal correlation. Methods for solving these approximation problems were suggested in [4]–[6], [8]–[10], [12].

The works [4]–[6] considered precoder design and aimed at maximizing the system’s spectral efficiency. They exploit the mmWave sparse multipath channel structure and deduce that the optimal fully-digital precoder is composed of a small sum of steering vectors. They then suggest a variant of the orthogonal matching pursuit (OMP) [14] algorithm to construct a feasible precoder that approximates the optimal one using a dictionary of steering vectors. This solution greatly reduces the problem complexity but results in a large performance gap from the optimal precoder due to restricting the space of possible precoder vectors to steering vectors.

arXiv:1712.03485v1 [cs.IT] 10 Dec 2017

Similar approaches were taken in [4], [8], [9] for the combiner design, under an MSE or spectral efficiency objective. In [10], two algorithms were suggested to approximate the optimal fully-digital precoder with a feasible one. The first, MO-AltMin, is based on manifold optimization. In each iteration of the algorithm, it assumes a given digital precoder and develops a conjugate gradient method to find an analog precoder that is a local minimizer of the approximation gap from the fully-digital one. Next, the digital precoder is computed using a least squares solution. This method achieves good performance but suffers from high complexity and run time, and is only suitable for fully-connected phase shifter networks. The second approach is a low complexity algorithm that assumes the digital precoder to be a scaled unitary matrix and uses this assumption to produce an upper bound on the approximation gap, which is then minimized over all analog precoders. However, limiting the digital precoder to such a structure results in performance loss.

For the partially connected architecture, most of the existing works concentrated on fixed sub-arrays where each RF chain is connected to a predetermined sub-array. In [11], the authors suggested a low complexity codebook design producing a small dictionary of feasible precoding vectors that are chosen based on the transmitted signal strength. They then exhaustively search over all possible combinations from the small dictionary to maximize the mutual information between the receiver and transmitter. This last step can result in heavy computational load when the channel is not sparse. In [10], the authors consider disjoint sub-arrays and suggest an iterative method to approximate the optimal fully-digital precoder, which optimizes over the analog and digital precoders alternately. It is shown that for disjoint arrays, the analog precoder problem is separable in the antennas, and has a closed form solution. For the digital precoder, a semidefinite relaxation is suggested. However, this is both computationally heavy and unnecessary since a closed form solution for the digital precoder is available. This alternating minimization approach still results in a large performance gap. Some less restrictive schemes were considered in [12], [13]. In [12], a double phase shifter with dynamic mapping is considered. The use of two phase shifters per antenna relaxes the unit modulus constraint that limits the previous solutions, but doubles the power consumption, area and complexity. In [13], a dynamic sub-array approach is considered for orthogonal frequency-division multiplexing (OFDM), and a greedy algorithm to optimize the array partition based on the long-term channel characteristics is suggested. However, this method is relevant only for OFDM transmissions.

Most of the above works concentrated on specific channel models and hardware architectures and suggested a tailored algorithm for the chosen scenario. In this work, we develop a general framework suitable for various channels and hardware: full and partial networks with different amounts of phase shifters and switches. We consider the data estimation problem in a single user massive MIMO system where both the receiver and transmitter are equipped with large antenna arrays and fewer RF chains than antennas. We assume a Bayesian model where the data and interference are both random, and aim at minimizing the MSE of the transmitted data from the low dimensional received digital signal at the receiver, over all hybrid precoders and combiners, assuming a fixed number of RF chains. Like previous works, we relax the difficult joint optimization to two separate problems in the precoder and combiner.

To design the precoder, we present a framework for approximating the optimal fully-digital precoder with a feasible hybrid one. Our alternating minimization of approximation gap (Alt-MaG) method exploits the fact that there exists an infinite set of optimal precoders, which differ by a unitary matrix. We suggest optimizing over this matrix and the hybrid precoder alternately, to find the fully-digital solution that results in the smallest approximation gap from its hybrid decomposition. For the hybrid precoder optimization step, any of the previously suggested methods [4]–[6], [8]–[10], [12] may be used, according to the hardware constraints. By optimizing over the unitary matrix as well, Alt-MaG achieves additional reduction in MSE compared to state-of-the-art algorithms, with no significant increase in complexity.

We then present a simple possible solution for the hybrid precoder optimization step that results in a low complexity algorithm, termed minimal gap iterative quantization (MaGiQ). In each iteration of MaGiQ, the fully-digital solution is approximated using the analog precoder alone. This results in a closed form solution, given by a simple quantization function which depends on the hardware structure. We demonstrate in simulations that MaGiQ achieves lower MSE than other low complexity algorithms when using very few RF chains. In addition, in some specific cases, it coincides with the optimal fully-digital solution.

Next, we show that MaGiQ can also be used for the combiner design with mild adjustments. We then suggest an additional greedy ratio trace maximization (GRTM) algorithm that directly minimizes the estimation error using a suitable dictionary that is chosen according to the hardware scheme. In simulations, we demonstrate that GRTM enjoys good performance and short running time, especially when the number of RF chains increases.

The rest of this paper is organized as follows: In Section II we introduce the signal model and problem formulation, and review common combiner hardware schemes. In Section III we introduce the massive MIMO data estimation problem and derive the relevant MSE minimization over the precoder and combiner. Sections IV and V consider the precoder and combiner design problems, respectively. In Section VI we evaluate the proposed algorithms using numerical experiments under different scenarios.

The following notations are used throughout the paper: boldface upper-case $X$ and lower-case $x$ letters are used to denote matrices and vectors respectively, and non-bold letters $x$ are scalars. The $i$th element in the $j$th column of $X$ is $[X]_{ij}$. We let $I$ denote the identity matrix of suitable size, $X^T$, $X^*$ the transpose and conjugate-transpose of $X$ respectively, $E[\cdot]$ the expectation and $\|\cdot\|_p$, $\|\cdot\|_F$ the $\ell_p$ and Frobenius norms. The determinant of $X$ is $|X|$, and $\mathcal{CN}(x, X)$ is the complex-Gaussian distribution with mean $x$ and covariance matrix $X$. The real part of a variable is denoted as $\mathcal{R}(\cdot)$, $\mathcal{R}(X)$ is the range space of $X$ and $P_X$ is the orthogonal projection onto $\mathcal{R}(X)$.

II. PROBLEM FORMULATION

A. Signal Model

Consider a single user massive MIMO system in which a transmitter with $N_t$ antennas communicates $N_s$ independent data streams using $N_{RF}^t$ RF chains, $N_s \le N_{RF}^t \le N_t$, to a receiver equipped with $N_r$ antennas and $N_{RF}^r$ RF chains, $N_s \le N_{RF}^r \le N_r$. At the transmitter, the RF chains are followed by a network of switches and phase shifters that expands the $N_{RF}^t$ digital outputs to $N_t$ precoded analog signals which feed the transmit antennas. Similarly, at the receiver, the antennas are followed by a network of switches and phase shifters that feed the $N_{RF}^r$ RF chains. The specific architecture of the analog hardware at each end can vary according to budget constraints. Some possible choices are presented in the next subsection. For simplicity, we assume in this paper that $N_{RF}^t = N_{RF}^r = N_s$, which corresponds to the minimal possible number of chains, and hence the worst-case scenario.

The hybrid architecture enables the transmitter to apply a BB precoder $F_{BB} \in \mathbb{C}^{N_{RF}^t \times N_{RF}^t}$, followed by an RF precoder $F_{RF} \in \mathcal{F}^{N_t \times N_{RF}^t}$. The properties of the set $\mathcal{F}$ are determined by the specific hardware scheme in use. For example, in fully-connected phase shifter networks, $\mathcal{F}$ is the set of unimodular matrices. The transmitter obeys a total power constraint such that $\|F_{RF}F_{BB}\|_F^2 = N_s$.

The discrete-time $N_t \times 1$ transmitted signal can be written as

$$ x = F_{RF}F_{BB}s, \tag{1} $$

with $s$ the $N_s \times 1$ symbol vector. We assume, without loss of generality, that $s \sim \mathcal{N}(0, I_{N_s})$. The signal is transmitted over a narrowband block-fading propagation channel such that the $N_r \times 1$ received analog signal vector $y$ at the receiver’s antennas is

$$ y = \sqrt{p_r}\, H F_{RF} F_{BB} s + z, \tag{2} $$

where $H$ is the $N_r \times N_t$ channel matrix, $p_r$ is the received average power and $z$ is an $N_r \times 1$ interference vector with zero mean and autocorrelation matrix $E[zz^*] = R_z$. Here, $z$ can represent either noise or an interfering signal. We assume full channel state information (CSI), i.e. the matrix $H$ is known. The interference correlation matrix $R_z$ is also known and assumed to be full rank.

At the receiver, an analog combiner maps the $N_r$ inputs to $N_{RF}^r$ RF chains that are then processed at baseband using a digital combiner. This yields the discrete signal

$$ r = \sqrt{p_r}\, W_{BB}^* W_{RF}^* H F_{RF} F_{BB} s + W_{BB}^* W_{RF}^* z, \tag{3} $$

with $W_{RF} \in \mathcal{W}^{N_r \times N_{RF}^r}$ the analog combining matrix and $W_{BB}$ the $N_{RF}^r \times N_{RF}^r$ digital combining matrix. The properties of the set $\mathcal{W}$ vary according to the specific analog hardware scheme.

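We wish to design $W_{RF}, W_{BB}, F_{RF}, F_{BB}$ to minimize the estimation error of $s$ from $r$, under different hardware constraints, i.e. different feasible sets $\mathcal{W}, \mathcal{F}$. Thus, we consider the problem

$$
\begin{aligned}
\min_{W_{RF}, W_{BB}, F_{RF}, F_{BB}} \quad & E\left[\|s - W_{BB}^* W_{RF}^* y\|_2^2\right] \\
\text{s.t.} \quad & F_{RF} \in \mathcal{F}^{N_t \times N_{RF}^t}, \quad \|F_{RF}F_{BB}\|_F^2 \le N_s, \\
& W_{RF} \in \mathcal{W}^{N_r \times N_{RF}^r}.
\end{aligned} \tag{4}
$$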

B. Precoder and Combiner Hardware Schemes

We now present some possible hardware schemes and the feasible sets they dictate. We focus on the combiner architecture and the feasible set $\mathcal{W}$, but all the options presented here can be translated to similar precoder architectures and feasible sets $\mathcal{F}$. The networks below are based on phase shifters and/or switches, with different connectivity levels. In practice, the choice of scheme can be affected by a variety of budget constraints such as power consumption, price, and area against the requested array gain. For example, phase shifters allow for a more flexible design than switches, but their power consumption is higher.

(S1) Fully Connected Phase Shifters and Switches Network: Each antenna is connected to an RF chain through an independent on/off switch followed by a phase shifter. All the incoming analog signals from the different antennas are combined before feeding the RF chain. In this architecture, the entries of the analog combining matrix $W_{RF}$ can be either zeros or unit modulus. Hence the set $\mathcal{W}$ is defined by $\mathcal{W}^{N_r \times N_{RF}^r} = \{W \in \mathbb{C}^{N_r \times N_{RF}^r} : |w_{ij}| \in \{0, 1\}\}$. The orthogonal projection $P_{\mathcal{W}}$ onto the feasible set dictated by this scheme is defined as

$$
[P_{\mathcal{W}}(A)]_{il} =
\begin{cases}
e^{j2\pi\angle[A]_{il}}, & |[A]_{il}| \ge \frac{1}{2} \\
0, & |[A]_{il}| < \frac{1}{2}
\end{cases}. \tag{5}
$$

The fully-connected phase shifters and switches network is presented in Fig. 1a.

(S2) Fully Connected Phase Shifters Network: (architecture A1 in [8]). In this case, each antenna is connected to an RF chain through an independent phase shifter, without a switch. Thus, $W_{RF}$ is a unimodular matrix: $\mathcal{W}^{N_r \times N_{RF}^r} = \{W \in \mathbb{C}^{N_r \times N_{RF}^r} : |w_{ij}| = 1\}$. Here we have

$$ [P_{\mathcal{W}}(A)]_{il} = e^{j2\pi\angle[A]_{il}}. \tag{6} $$

Fig. 1b demonstrates this setting.

(S3) Switching Network: (architecture A5 in [8]). This is an antenna selection scheme, where each RF chain is preceded by a single switch that can toggle between the $N_r$ antennas. A total of $N_{RF}^r$ antennas are selected and sampled. It follows that $W_{RF}$ is a partial permutation matrix: $\mathcal{W}^{N_r \times N_{RF}^r} = \{W \in \mathbb{C}^{N_r \times N_{RF}^r} : w_{ij} \in \{0, 1\}, \|w_j\|_0 = 1\}$. The projection $P_{\mathcal{W}}(A)$ chooses the entry with largest absolute value in each column of $A$ and sets the corresponding entry of the output to 1, while all other entries are set to 0. This setting is shown in Fig. 1c. Another structure is the switching network with sub-arrays (architecture A6 in [8]), where the switching range is limited to a certain subset of antennas for each RF chain. Note that in switching networks, phase shifters are unnecessary, since each input can be processed independently at BB.

(S4) Partially Connected Phase Shifters Network with Fixed Sub Arrays: (architecture A2 in [8]). This scheme divides the array into (possibly overlapping) sub-arrays of size $G$. Each sub-array operates as a fully-connected phase shifter network with a single RF chain. This results in the feasible set $\mathcal{W}^{N_r \times N_{RF}^r} = \{W \in \mathbb{C}^{N_r \times N_{RF}^r} : |w_{ij}| = 1, \forall i \in S_j, \ |w_{ij}| = 0, \forall i \notin S_j\}$, with $S_j$ the group of indices corresponding to antennas that belong to the $j$th sub-array. In this case $P_{\mathcal{W}}(A)$ applies the mapping in (6) to the $G$ entries of $a_j$ that belong to $S_j$, and sets all others to 0. The fixed sub-arrays network is presented in Fig. 1d. The parameter $G$ represents the connectivity factor; for $G = N_r$, we get the fully-connected network S2.

(S5) Flexible Partially Connected Phase Shifters Network with Sub Arrays: This architecture is similar to the previous one, except that the contributing antennas to each RF chain are not fixed, but can be optimized, i.e. the subgroups $S_j$ are flexible. This is implemented using an $N_r$-to-1 switch that precedes the phase shifters. The corresponding feasible set is $\mathcal{W}^{N_r \times N_{RF}^r} = \{W \in \mathbb{C}^{N_r \times N_{RF}^r} : |w_{ij}| \in \{0, 1\}, \|w_j\|_0 = G\}$. The function $P_{\mathcal{W}}(A)$ in this case chooses the $G$ entries with largest absolute value in each column of $A$ and applies (6) to them while setting all others to 0.
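The projections described for S1 to S5 can be sketched in NumPy as follows. This is only an illustrative sketch, not code from the paper: the function names are hypothetical, the sub-array index lists `groups` are an assumed input format, and the phase mapping is implemented as `exp(1j * angle(a))` with the angle in radians, which is one reading of the mapping in (5) and (6).

```python
import numpy as np

def _phase(a):
    """Unit-modulus entries carrying the phase of a (a zero entry maps to 1)."""
    return np.exp(1j * np.angle(a))

def project_s1(a):
    """S1: keep the phase where |a| >= 1/2, switch the entry off otherwise."""
    return np.where(np.abs(a) >= 0.5, _phase(a), 0)

def project_s2(a):
    """S2: every entry becomes unit modulus with the phase of a."""
    return _phase(a)

def project_s3(a):
    """S3: a single 1 per column, placed at the largest-magnitude entry."""
    out = np.zeros(a.shape)
    out[np.argmax(np.abs(a), axis=0), np.arange(a.shape[1])] = 1.0
    return out

def project_s4(a, groups):
    """S4: groups[j] lists the antenna indices wired to RF chain j (assumed format)."""
    out = np.zeros(a.shape, dtype=complex)
    for j, idx in enumerate(groups):
        out[idx, j] = _phase(a[idx, j])
    return out

def project_s5(a, g):
    """S5: phases on the g largest-magnitude entries of each column, zeros elsewhere."""
    out = np.zeros(a.shape, dtype=complex)
    top = np.argsort(-np.abs(a), axis=0)[:g]   # shape (g, n_columns)
    cols = np.arange(a.shape[1])
    out[top, cols] = _phase(a[top, cols])
    return out
```

With `g` equal to the number of rows, `project_s5` reduces to `project_s2`, mirroring the remark that $G = N_r$ recovers the fully-connected network.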

The networks above constitute some of the main classes of existing analog architectures. They present different levels of challenge in terms of hardware complexity, as in Table I. For a complete power consumption comparison, the reader is referred to the example in [8]. One may define many additional architectures based on the above networks by modifying their components and connectivity factors.

III. MSE MINIMIZATION

To optimally design the precoder and combiner we consider the MSE minimization problem in (4).

Since there are no constraints on $W_{BB}$ in (4), it can be chosen as any $N_{RF}^r \times N_{RF}^r$ matrix, including the linear MMSE estimator of $s$ from the measurements $\tilde{y} = W_{RF}^* y$. This estimator depends on the matrices $F_{RF}, F_{BB}, W_{RF}$ and is given by

$$ W_{BB,opt}^* = \bar{H}^* W_{RF} \left[ W_{RF}^* \left( \bar{H}\bar{H}^* + R_z \right) W_{RF} \right]^{-1}, \tag{7} $$

with

$$ \bar{H} = \sqrt{p_r}\, H F_{RF} F_{BB}. \tag{8} $$

Fig. 1: Combiner hardware schemes. (1a) S1: Fully connected phase shifters and switches network, (1b) S2: Fully connected phase shifters network, (1c) S3: Switching network, (1d) S4: Fixed partially connected phase shifters network with sub-arrays, and (1e) S5: Flexible partially connected phase shifters network with sub-arrays.

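The resulting estimate is

$$ \hat{s} = \bar{H}^* W_{RF} \left[ W_{RF}^* \left( \bar{H}\bar{H}^* + R_z \right) W_{RF} \right]^{-1} W_{RF}^* y, \tag{9} $$

with $\bar{H} = \sqrt{p_r}\, H F_{RF} F_{BB}$ the effective channel, and its MSE is equal to

$$ E\left[\|s - \hat{s}\|^2\right] = N_s - \mathrm{tr}\left( \bar{H}^* W_{RF} \left[ W_{RF}^* \left( \bar{H}\bar{H}^* + R_z \right) W_{RF} \right]^{-1} W_{RF}^* \bar{H} \right). \tag{10} $$

The digital combiner of (7) is the optimal linear estimator in the MSE sense for any given precoders $F_{RF}, F_{BB}$ and analog combiner $W_{RF}$. Thus, our remaining goal is to design $W_{RF}, F_{RF}, F_{BB}$ to minimize (10). This is equivalent to the following maximization problem:

$$
\begin{aligned}
\max_{W_{RF}, F_{RF}, F_{BB}} \quad & f(W_{RF}, F_{RF}, F_{BB}) \\
\text{s.t.} \quad & F_{RF} \in \mathcal{F}^{N_t \times N_{RF}^t}, \quad \|F_{RF}F_{BB}\|_F^2 \le N_s, \\
& W_{RF} \in \mathcal{W}^{N_r \times N_{RF}^r},
\end{aligned} \tag{11}
$$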

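TABLE I: Hardware complexity comparison between different architectures

| | S1 | S2 | S3 | S4 | S5 |
|---|---|---|---|---|---|
| Phase Shifters # | $N_r \cdot N_{RF}^r$ | $N_r \cdot N_{RF}^r$ | 0 | $G \cdot N_{RF}^r$ | $G \cdot N_{RF}^r$ |
| Switches # | $N_r \cdot N_{RF}^r$ | 0 | $N_{RF}^r$ | 0 | $G \cdot N_{RF}^r$ |
| Switch Type | on-off | - | $N_r$-to-1 | - | $N_r$-to-1 |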

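with

$$ f(W_{RF}, F_{RF}, F_{BB}) = \mathrm{tr}\left( W_{RF}^* \bar{H}\bar{H}^* W_{RF} \left[ W_{RF}^* \left( \bar{H}\bar{H}^* + R_z \right) W_{RF} \right]^{-1} \right). \tag{12} $$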
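As a numerical sanity check of the relation between (10) and (12), the sketch below evaluates the MSE once as $N_s - f$ and once by expanding $E[\|s - W_{BB}^* W_{RF}^* y\|^2]$ directly for the combiner in (7), assuming unit-covariance symbols independent of the interference. The random channel, the random hybrid precoder and all function names are illustrative assumptions, not part of the paper.

```python
import numpy as np

def cplx(rng, *shape):
    """Circularly symmetric complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def mse_from_trace(h_bar, w_rf, r_z):
    """MSE via (10): N_s - f, with f the trace expression in (12)."""
    n_s = h_bar.shape[1]
    g = w_rf.conj().T @ h_bar                                  # W_RF^* H_bar
    c = w_rf.conj().T @ (h_bar @ h_bar.conj().T + r_z) @ w_rf  # covariance of W_RF^* y
    f = np.trace(g.conj().T @ np.linalg.solve(c, g)).real
    return n_s - f

def mse_direct(h_bar, w_rf, r_z):
    """Direct expansion of the error covariance for W_BB taken from (7)."""
    n_s = h_bar.shape[1]
    g = w_rf.conj().T @ h_bar
    r = w_rf.conj().T @ r_z @ w_rf
    w_bb_h = g.conj().T @ np.linalg.inv(g @ g.conj().T + r)    # W_BB,opt^*
    e = np.eye(n_s) - w_bb_h @ g                               # signal distortion term
    return (np.trace(e @ e.conj().T) + np.trace(w_bb_h @ r @ w_bb_h.conj().T)).real

rng = np.random.default_rng(0)
n_t, n_r, n_s, p_r = 16, 12, 2, 1.0
h = cplx(rng, n_r, n_t)                        # placeholder channel
f_hyb = cplx(rng, n_t, n_s)                    # placeholder product F_RF F_BB
f_hyb *= np.sqrt(n_s) / np.linalg.norm(f_hyb)  # enforce ||F_RF F_BB||_F^2 = N_s
h_bar = np.sqrt(p_r) * h @ f_hyb               # effective channel (8)
w_rf = np.exp(1j * rng.uniform(0, 2 * np.pi, (n_r, n_s)))  # unimodular combiner (S2)
r_z = 0.1 * np.eye(n_r)

mse_closed = mse_from_trace(h_bar, w_rf, r_z)
mse_check = mse_direct(h_bar, w_rf, r_z)
print(mse_closed, mse_check)
```

The two values agree up to rounding, which is why minimizing (10) and maximizing $f$ in (11) are interchangeable.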

Similar to the case considered in [4], the joint optimization problem (11) is non-convex and has no known solution. Hence, as in previous works [4], [10], [12], we simplify the problem by decoupling it. First, in Section IV, we optimize the precoders $F_{RF}, F_{BB}$ given $W_{RF}$. Then, in Section V, we assume fixed $F_{RF}, F_{BB}$ and optimize the combiner $W_{RF}$.

IV. PRECODER DESIGN

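We now consider the problem of designing the precoder $F = F_{RF}F_{BB}$. For this purpose we assume a fixed analog combiner and rewrite (11) as

$$
\begin{aligned}
\max_{F_{RF}, F_{BB}} \quad & \mathrm{tr}\left( \tilde{H} F F^* \tilde{H}^* \left[ \tilde{H} F F^* \tilde{H}^* + \tilde{R} \right]^{-1} \right) \\
\text{s.t.} \quad & F_{RF} \in \mathcal{F}^{N_t \times N_{RF}^t}, \quad \|F_{RF}F_{BB}\|_F^2 \le N_s,
\end{aligned} \tag{13}
$$

with $\tilde{H} = \sqrt{p_r}\, W_{RF}^* H$ and $\tilde{R} = W_{RF}^* R_z W_{RF}$. If we relax the constraint $F_{RF} \in \mathcal{F}^{N_t \times N_{RF}^t}$ to the fully-digital case $F_{RF} \in \mathbb{C}^{N_t \times N_{RF}^t}$, then (13) has a closed form solution [15]

$$ F_{opt} = V \Phi T \tag{14} $$

where $V$ contains the first $N_{RF}^t$ eigenvectors of the matrix $\tilde{H}^* \tilde{R}^{-1} \tilde{H}$, $\Phi$ is an $N_{RF}^t \times N_{RF}^t$ diagonal matrix with power allocation weights as in [15], and $T$ denotes any $N_{RF}^t \times N_{RF}^t$ unitary matrix. It follows that there is a set of optimal solutions, one for every unitary $T$.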
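The freedom in $T$ comes from the fact that the objective in (13) depends on $F$ only through $FF^*$, so right-multiplying by a unitary matrix leaves it unchanged. The following sketch checks this numerically. The effective channel and interference covariance are random placeholders, and $\Phi$ is set to the identity purely for illustration; it is not the power allocation of [15].

```python
import numpy as np

def objective13(f, h_eff, r_eff):
    """Trace objective of (13) for a precoder f."""
    a = h_eff @ f @ f.conj().T @ h_eff.conj().T
    return np.trace(a @ np.linalg.inv(a + r_eff)).real

def random_unitary(rng, n):
    """Unitary matrix from the QR factorization of a complex Gaussian matrix."""
    z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    q, _ = np.linalg.qr(z)
    return q

rng = np.random.default_rng(1)
n_t, n_rf = 16, 3
h_eff = rng.standard_normal((n_rf, n_t)) + 1j * rng.standard_normal((n_rf, n_t))
r_eff = 0.1 * np.eye(n_rf)

# Leading eigenvectors of H~^* R~^{-1} H~ (eigh returns ascending eigenvalues).
m = h_eff.conj().T @ np.linalg.solve(r_eff, h_eff)
_, vecs = np.linalg.eigh(m)
v = vecs[:, ::-1][:, :n_rf]
phi = np.eye(n_rf)                 # placeholder weights, NOT the allocation of [15]

t = random_unitary(rng, n_rf)
val_identity = objective13(v @ phi, h_eff, r_eff)
val_rotated = objective13(v @ phi @ t, h_eff, r_eff)
print(val_identity, val_rotated)
```

Both values coincide up to rounding, and the power $\|V\Phi T\|_F^2$ is also unchanged by $T$, so every rotated precoder is equally feasible.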

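Previous works [4]–[6], [8], [10], [12] considered $T = I_{N_{RF}^t}$ and suggested to approximate $F_{opt}$ with a decomposition $F_{RF}F_{BB}$, which yields the problem

$$
\begin{aligned}
\min_{F_{RF}, F_{BB}} \quad & \|F_{opt} - F_{RF}F_{BB}\|_F^2 \\
\text{s.t.} \quad & F_{RF} \in \mathcal{F}^{N_t \times N_{RF}^t}, \quad \|F_{RF}F_{BB}\|_F^2 \le N_s.
\end{aligned} \tag{15}
$$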
This approach is motivated by the approximation in [4] which shows that minimizing (15) approximately maximizes the system’s spectral efficiency. However, these solutions suffer from several disadvantages. In [4]–[6], [8], [9], the suggested design for the analog precoder is based on a dictionary composed of candidate vectors. This constraint leads to a large performance gap from the fully-digital precoder, especially in the case where the number of RF chains is much less than the number of multipath components in the channel $H$. In [10], the authors suggest a manifold optimization algorithm termed MO-AltMin that achieves good performance, but has very high complexity and is suitable only for scheme S2. The solution in [12] is based on the network S4 but with twice the number of phase shifters per RF chain, and is irrelevant for other schemes. None of the above works take advantage of the flexibility in choosing $T$, which we show below can boost the performance of existing solutions with only a small computational price.

Specifically, we exploit the flexibility in the choice of $T$, and find the unitary matrix that results in the smallest approximation gap $\|V\Phi T - F_{RF}F_{BB}\|_F^2$. To this end, we develop an alternating minimization framework inspired by [16], that alternately optimizes over $F_{RF}, F_{BB}$ and $T$. In the simulations in Section VI, we demonstrate that by adding the optimization over $T$, we are able to improve the performance of state-of-the-art algorithms, with no substantial increase in complexity.

A. Alt-MaG - Alternating Minimization of Approximation Gap

Plugging (14) into (15) and adding $T$ as an optimization variable, our problem becomes

$$
\begin{aligned}
\min_{T, F_{RF}, F_{BB}} \quad & \|V\Phi T - F_{RF}F_{BB}\|_F^2 \\
\text{s.t.} \quad & F_{RF} \in \mathcal{F}^{N_t \times N_{RF}^t}, \quad \|F_{RF}F_{BB}\|_F^2 \le N_s, \\
& T \in \mathcal{U}^{N_{RF}^t},
\end{aligned} \tag{16}
$$

with $\mathcal{U}^M$ the set of all $M \times M$ unitary matrices. To approximate the solution of (16), we consider an alternating minimization approach. First, we fix $F_{RF}, F_{BB}$ and solve

$$
\begin{aligned}
\min_{T} \quad & \|V\Phi T - F_{RF}F_{BB}\|_F^2 \\
\text{s.t.} \quad & T \in \mathcal{U}^{N_{RF}^t}.
\end{aligned} \tag{17}
$$

Next, using the resulting $T$, we solve

$$
\begin{aligned}
\min_{F_{RF}, F_{BB}} \quad & \|V\Phi T - F_{RF}F_{BB}\|_F^2 \\
\text{s.t.} \quad & F_{RF} \in \mathcal{F}^{N_t \times N_{RF}^t}, \quad \|F_{RF}F_{BB}\|_F^2 \le N_s,
\end{aligned} \tag{18}
$$

and repeat iteratively.

If both (17) and (18) can be solved, then the objective value decreases in each step, and since it is bounded from below, the algorithm will converge to a local optimum. However, while (17) has a (closed form) optimal solution, problem (18) is a special case of (15), which has no known solution. To attain a sub-optimal solution, any of the mentioned algorithms [4]–[6], [8]–[10], [12] can be used (provided it is compatible with the hardware choice). Each of these solutions will result in a different algorithm from the Alt-MaG family.

A closed form solution to problem (17) is given in the next theorem.

Theorem 1. Given the singular value decomposition (SVD) $F_{BB}^* F_{RF}^* V \Phi = U \Lambda \tilde{V}^*$, the optimal solution $T_{opt}$ to (17) is given by

$$ T_{opt} = \tilde{V} U^*. \tag{19} $$

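Proof. Let $F_{BB}^* F_{RF}^* V \Phi = U \Lambda \tilde{V}^*$ be the SVD from Theorem 1. First note that $\|V\Phi T - F_{RF}F_{BB}\|_F^2 = \|V\Phi\|_F^2 + \|F_{RF}F_{BB}\|_F^2 - 2\mathcal{R}\left(\mathrm{tr}\left(F_{BB}^* F_{RF}^* V\Phi T\right)\right)$. Thus, it is sufficient to maximize $\mathcal{R}\left(\mathrm{tr}\left(F_{BB}^* F_{RF}^* V\Phi T\right)\right)$. Denote $\Lambda = \Omega^2$. Then,

$$
\begin{aligned}
\mathcal{R}\left(\mathrm{tr}\left(F_{BB}^* F_{RF}^* V\Phi T\right)\right)
&\le \left|\mathrm{tr}\left(F_{BB}^* F_{RF}^* V\Phi T\right)\right| \\
&= \left|\mathrm{tr}\left((U\Omega)(T^*\tilde{V}\Omega)^*\right)\right| \\
&\overset{(*)}{\le} \sqrt{\mathrm{tr}\left(U\Omega^2 U^*\right)\mathrm{tr}\left(T^*\tilde{V}\Omega^2\tilde{V}^* T\right)} \\
&= \mathrm{tr}\left(\Omega^2\right),
\end{aligned}
$$

where $(*)$ stems from the Cauchy-Schwarz inequality. For $T = \tilde{V}U^*$, the upper bound is achieved with equality.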
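The update of Theorem 1 is a unitary Procrustes step and takes a few lines with an SVD. The sketch below, with hypothetical function names and random placeholder matrices, computes $T_{opt}$ and compares its gap with that of randomly drawn unitary matrices. Note that `np.linalg.svd` returns the third factor already conjugate-transposed.

```python
import numpy as np

def best_unitary(f_dig, f_rf, f_bb):
    """Unitary T minimizing ||f_dig @ T - f_rf @ f_bb||_F (Theorem 1)."""
    # svd returns u, singular values, vh with M = u @ diag(s) @ vh.
    u, _, vh = np.linalg.svd(f_bb.conj().T @ f_rf.conj().T @ f_dig)
    return vh.conj().T @ u.conj().T

def gap(f_dig, t, f_rf, f_bb):
    """Squared Frobenius approximation gap of (17)."""
    return np.linalg.norm(f_dig @ t - f_rf @ f_bb) ** 2

rng = np.random.default_rng(2)
n_t, n_rf = 16, 3
f_dig = rng.standard_normal((n_t, n_rf)) + 1j * rng.standard_normal((n_t, n_rf))
f_rf = np.exp(1j * rng.uniform(0, 2 * np.pi, (n_t, n_rf)))
f_bb = rng.standard_normal((n_rf, n_rf)) + 1j * rng.standard_normal((n_rf, n_rf))

t_opt = best_unitary(f_dig, f_rf, f_bb)
best = gap(f_dig, t_opt, f_rf, f_bb)

# Gaps obtained with randomly drawn unitary matrices, for comparison.
others = []
for _ in range(200):
    z = rng.standard_normal((n_rf, n_rf)) + 1j * rng.standard_normal((n_rf, n_rf))
    q, _ = np.linalg.qr(z)
    others.append(gap(f_dig, q, f_rf, f_bb))
print(best, min(others))
```

No random candidate attains a smaller gap than `t_opt`, as the theorem predicts.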

The resulting family of Alt-MaG algorithms is outlined in Algorithm 1.

Algorithm 1 Alt-MaG - alternating minimization of approximation gap
Input: fully-digital precoder $V\Phi$, threshold $t$
Output: digital precoder $F_{BB}$, analog precoder $F_{RF}$
Initialize $T = I_{N_{RF}^t}$, $F_{RF} = 0$, $F_{BB} = 0$
While $\|V\Phi T - F_{RF}F_{BB}\|_F^2 \ge t$ do:
  1) Calculate $(F_{RF}, F_{BB})$ from (18) using one of the methods in Section IV
  2) Calculate the SVD $F_{BB}^* F_{RF}^* V\Phi = U\Lambda\tilde{V}^*$
  3) Set $T = \tilde{V}U^*$

Next we suggest a possible simplification to (18) that can be used with any of the schemes S1-S5, and results in a simple closed form solution for $F_{RF}, F_{BB}$.

B. MaGiQ - Minimal Gap Iterative Quantization

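The complexity of Algorithm 1 is determined by the method used in step 1 and could be accordingly high. One approach that yields a low complexity algorithm is approximating $V\Phi T$ by an analog precoder only, setting the digital precoder to $F_{BB} = I_{N_{RF}^t}$. That is, in every iteration, solving

$$
\begin{aligned}
\min_{F_{RF}} \quad & \|V\Phi T - F_{RF}\|_F^2 \\
\text{s.t.} \quad & F_{RF} \in \mathcal{F}^{N_t \times N_{RF}^t}.
\end{aligned} \tag{20}
$$

The benefit of this approach is the existence of a simple, optimal, closed form solution to (20), given by

$$ F_{RF} = P_{\mathcal{F}}(V\Phi T), \tag{21} $$

with $P_{\mathcal{F}}(\cdot)$ defined in Section II.

Once $F_{RF}$ and $T$ converge, $F_{BB}$ is calculated as the solution to

$$
\begin{aligned}
\min_{F_{BB}} \quad & \|V\Phi T - F_{RF}F_{BB}\|_F^2 \\
\text{s.t.} \quad & \|F_{RF}F_{BB}\|_F^2 \le N_s.
\end{aligned} \tag{22}
$$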

Note that $F_{RF}$ depends on $V\Phi T$, which is a function of the random channel and interference. With high probability, it has full column rank. In this case, (22) has a closed form solution

$$ F_{BB} = \left[F_{RF}^* F_{RF}\right]^{-1} F_{RF}^* V\Phi T. \tag{23} $$

Note that since the optimal solution $V\Phi T$ obeys the transmit power constraint $\|V\Phi T\|_F^2 \le N_s$, it follows that

$$
\begin{aligned}
\|F_{RF}F_{BB}\|_F^2 &= \left\|F_{RF}\left[F_{RF}^* F_{RF}\right]^{-1} F_{RF}^* V\Phi T\right\|_F^2 \\
&= \|P_{F_{RF}} V\Phi T\|_F^2 \le \|V\Phi T\|_F^2 \le N_s.
\end{aligned}
$$

Therefore, (23) complies with the transmit power constraint.

MaGiQ is summarized in Algorithm 2. Since the non-negative objective function $\|V\Phi T - F_{RF}\|_F^2$ does not increase at each iteration of Algorithm 2, the algorithm converges. In the simulations in Section VI, we demonstrate that MaGiQ has lower MSE than other low-complexity methods in the case of very few RF chains.

Algorithm 2 MaGiQ - Minimal Gap Iterative Quantization
Input: fully-digital precoder $V\Phi$, threshold $t$
Output: digital precoder $F_{BB}$, analog precoder $F_{RF}$
Initialize $T = I_{N_{RF}^t}$, $F_{RF} = 0$, $F_{BB} = I_{N_{RF}^t}$
While $\|V\Phi T - F_{RF}\|_F^2 \ge t$ do:
  1) Calculate $F_{RF} = P_{\mathcal{F}}(V\Phi T)$
  2) Calculate the SVD $F_{RF}^* V\Phi = U\Lambda\tilde{V}^*$
  3) Set $T = \tilde{V}U^*$
Calculate $F_{BB} = \left[F_{RF}^* F_{RF}\right]^{-1} F_{RF}^* V\Phi T$
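A minimal NumPy sketch of Algorithm 2 for the fully-connected phase shifter scheme S2 is given below. It rests on several assumptions that are not in the paper: the function name `magiq_s2`, a stopping rule based on stagnation of the gap (the algorithm as stated stops when the gap falls below a threshold $t$, which a unimodular $F_{RF}$ may never reach), a cap on the number of iterations, and a random orthonormal $V$ with placeholder weights in $\Phi$ for the demonstration.

```python
import numpy as np

def magiq_s2(f_dig, tol=1e-9, max_iter=200):
    """MaGiQ sketch for scheme S2; f_dig = V @ Phi has shape (N_t, N_RF)."""
    n_rf = f_dig.shape[1]
    t = np.eye(n_rf, dtype=complex)
    gaps = []
    for _ in range(max_iter):
        f_rf = np.exp(1j * np.angle(f_dig @ t))          # step 1: projection (21)
        u, _, vh = np.linalg.svd(f_rf.conj().T @ f_dig)  # step 2: SVD of F_RF^* V Phi
        t = vh.conj().T @ u.conj().T                     # step 3: T = V~ U^*
        gaps.append(np.linalg.norm(f_dig @ t - f_rf) ** 2)
        # Assumed stopping rule: quit once the gap stops improving.
        if len(gaps) > 1 and gaps[-2] - gaps[-1] < tol:
            break
    # Least squares digital precoder, equation (23).
    f_bb = np.linalg.lstsq(f_rf, f_dig @ t, rcond=None)[0]
    return f_rf, f_bb, t, gaps

rng = np.random.default_rng(3)
n_t, n_rf = 32, 3
v, _ = np.linalg.qr(rng.standard_normal((n_t, n_rf))
                    + 1j * rng.standard_normal((n_t, n_rf)))   # orthonormal columns
f_dig = v @ np.diag([1.2, 1.0, 0.6])   # placeholder weights, total power 2.8 <= N_s
f_rf, f_bb, t, gaps = magiq_s2(f_dig)
print(len(gaps), gaps[0], gaps[-1])
```

The recorded gaps form a non-increasing sequence, and the product `f_rf @ f_bb` never carries more power than `f_dig`, in line with the argument following (23). Other schemes would only change the projection in step 1.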

A version of Algorithm 2, PE-AltMin, has been previously suggested in [10] for the scheme S2, as an optimization of an upper bound on (18). There, the authors assumed that the optimal $F_{BB}$ is a scaled unitary matrix of the form $F_{BB} = \alpha T$, and considered the problem

$$
\begin{aligned}
\min_{T, \alpha, F_{RF}} \quad & \|V\Phi - \alpha F_{RF}T\|_F^2 \\
\text{s.t.} \quad & F_{RF} \in \mathcal{F}^{N_t \times N_{RF}^t}, \quad \|\alpha F_{RF}T\|_F^2 = N_s, \\
& T \in \mathcal{U}^{N_{RF}^t}.
\end{aligned} \tag{24}
$$

Setting $\alpha = \frac{\sqrt{N_s}}{\|F_{RF}T\|_F}$, and using some mathematical manipulations, they bounded the objective in (24) from above with the expression $\|V\Phi T^* - F_{RF}\|_F^2$. After optimizing the upper bound over $T$ and $F_{RF}$ alternately, they set $F_{BB}$ as $F_{BB} = \frac{\sqrt{N_s}}{\|F_{RF}T\|_F} T$, which is sub-optimal, as demonstrated in Section VI, where we compare both methods. The main difference between this approach and MaGiQ is that while we exploit the degree of freedom in the fully-digital precoder, PE-AltMin restricts the set of possible digital precoders to scaled unitary matrices alone, which degrades performance. In addition, Algorithm 2 is relevant to any hardware structure.

Alt-MaG and MaGiQ were developed for MSE minimization, but can be generalized to other performance criteria by replacing the optimal unconstrained solution expressions. For example, when considering the system’s spectral efficiency as in [4], [10], the optimal solution is (14) with $\Phi$ now being a diagonal matrix with the power allocation weights given by water filling [15].

C. Optimality of MaGiQ

One advantage of MaGiQ over other existing techniques is that in some special cases it converges to the optimal fully-digital solution, as we show in the following proposition.

Proposition 1. Define $F_{opt}$ as in (14), and assume that $\Phi$ has positive values on its diagonal. Denote by $F$ the hybrid precoder produced by MaGiQ for the input $F_{opt}$. If $V\Psi \in \mathcal{F}^{N_t \times N_{RF}^t}$ for some diagonal, positive definite matrix $\Psi$, then

$$ F = V\Phi. $$

Proposition 1 implies that if each column of $V$ lies in the feasible set $\mathcal{F}^{N_t \times 1}$ up to some scaling, then MaGiQ will produce a globally optimal solution. This property is not assured in other algorithms, as demonstrated in the simulations in Section VI.

Proof. For simplicity, we consider here the fully-connected phase shifter network S2. However, Proposition 1 holds for any hardware scheme with the same condition $V\Psi \in \mathcal{F}^{N_t \times N_{RF}^t}$, and similar proofs can be constructed accordingly.

For S2, $V\Psi \in \mathcal{F}^{N_t \times N_{RF}^t}$ implies that $V$ has columns with constant modulus. In the first iteration of MaGiQ, $T$ is initialized as $T = I_{N_{RF}^t}$. Since $\Phi$ is a diagonal matrix with positive diagonal values, step 1 of the algorithm leads to

$$ F_{RF} = P_{\mathcal{F}}(V\Phi) = V\Psi. $$

It follows that $F_{RF}^* V\Phi = \Psi\Phi$, which is diagonal. Thus, step 3 yields

$$ T = I_{N_{RF}^t} $$

and the algorithm converges after a single iteration. For $F_{BB}$ we get

$$ F_{BB} = \left[F_{RF}^* F_{RF}\right]^{-1} F_{RF}^* V\Phi = \Psi^{-1}\Phi, $$

and the resulting hybrid precoder is $F = F_{RF}F_{BB} = V\Phi$.

Note that the diagonal matrix $\Phi$ represents non-negative weights [15]. If one of the weights is equal to zero, then the corresponding column in the optimal fully-digital precoder is all zeros. In that case, we can drop the relevant column, and use the above proposition with the remaining fully-digital precoder.

One example of a scenario in which the conditions of Proposition 1 hold is a fully-connected phase shifter precoder (for simplicity we assume a fully-digital combiner) with $R_z = \sigma_z^2 I_{N_r}$ and a circulant channel matrix, that is

$$H = A_r \Lambda A_t^*, \tag{25}$$

where $A_r, A_t$ are DFT matrices and $\Lambda$ is a diagonal gain matrix. In this case, the singular vectors of $H^* R_z^{-1} H$ are columns of the DFT matrix, which are unimodular, and thus lie in $\mathcal{F}^{N_t \times 1}$ for S2. Therefore, from Proposition 1, MaGiQ produces the optimal solution (14).
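The single-iteration argument in the proof of Proposition 1 can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes that step 1 of MaGiQ is the entrywise unit-modulus projection for S2 and that step 3 is a unitary Procrustes fit of $V\Phi T$ to $F_{RF}$, as suggested by the proof. The function name `check_proposition_1` and the choice of $V$ as the first columns of a unitary DFT matrix are illustrative assumptions.

```python
import numpy as np

def check_proposition_1(n_t=8, n_rf=3, seed=0):
    """Run one MaGiQ-style iteration for a V with constant-modulus columns."""
    rng = np.random.default_rng(seed)
    # V: first n_rf columns of the unitary DFT matrix (modulus 1/sqrt(n_t) everywhere).
    n = np.arange(n_t)
    dft = np.exp(-2j * np.pi * np.outer(n, n) / n_t) / np.sqrt(n_t)
    V = dft[:, :n_rf]
    # Phi: diagonal matrix with positive weights (values are arbitrary here).
    Phi = np.diag(rng.uniform(0.5, 2.0, size=n_rf))
    F_opt = V @ Phi  # T = I in the first iteration

    # Step 1 (assumed form): project onto unit-modulus entries, scheme S2.
    F_rf = np.exp(1j * np.angle(F_opt))

    # Step 3 (assumed form): unitary Procrustes, min_T ||F_opt T - F_rf||_F.
    U, _, Wh = np.linalg.svd(F_opt.conj().T @ F_rf)
    T = U @ Wh

    # Least-squares baseband precoder [F_rf^* F_rf]^{-1} F_rf^* V Phi.
    F_bb = np.linalg.solve(F_rf.conj().T @ F_rf, F_rf.conj().T @ F_opt)
    return T, F_rf @ F_bb, F_opt

T, F, F_opt = check_proposition_1()
print(np.allclose(T, np.eye(T.shape[0])), np.allclose(F, F_opt))
```

Under these assumptions the unitary matrix stays at the identity and the hybrid product $F_{RF}F_{BB}$ reproduces $V\Phi$, in line with the proposition.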

A common special case of (25) is the well-known narrowband mmWave clustered channel [17]–[19], with a single ray in each cluster and an asymptotic number of antennas. This model consists of $N_{cl}$ clusters with $N_{ray}$ propagating rays, each ray associated with transmit and receive directions and a complex gain. The channel matrix can be written as

$$H = \sqrt{\frac{N_t N_r}{N_{cl} N_{ray}}} \sum_{i=1}^{N_{cl}} \sum_{l=1}^{N_{ray}} \alpha_{il}\, a_r(\phi^r_{il}, \theta^r_{il})\, a_t(\phi^t_{il}, \theta^t_{il})^*, \tag{26}$$

with $\alpha_{il}$ the complex gain of the $l$th ray in the $i$th cluster, and $(\phi^t_{il}, \theta^t_{il})$, $(\phi^r_{il}, \theta^r_{il})$ the azimuth and elevation angles of departure and arrival for the $l$th ray in the $i$th cluster, respectively. The vectors $a_t(\phi^t_{il}, \theta^t_{il})$ and $a_r(\phi^r_{il}, \theta^r_{il})$ represent the transmit and receive array responses, and depend on the array geometry.

For $N_{ray} = 1$, (26) can be written as (25) with $\Lambda$ an $N_{cl} \times N_{cl}$ diagonal matrix with the elements $\sqrt{\frac{N_t N_r}{N_{cl} N_{ray}}}\, \alpha_{il}$ on its diagonal, and $A_r, A_t$ being steering matrices composed of the vectors $a_r(\phi^r_{il}, \theta^r_{il})$ and $a_t(\phi^t_{il}, \theta^t_{il})$ respectively. As $N_r, N_t$ grow to infinity, the matrices $A_r, A_t$ become orthogonal. Hence, in a noise-limited system, $V$ is asymptotically equal to the steering vectors (up to some normalization constant) and thus unimodular.
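The asymptotic orthogonality of the steering matrices can be illustrated with a short numerical sketch. It assumes the ULA response of (41) as printed in Section VI, and a handful of fixed, distinct angles chosen only for illustration; the helper names `ula_response` and `max_cross_correlation` are hypothetical.

```python
import numpy as np

def ula_response(n, phi):
    """Unit-norm ULA steering vector, following (41) as printed."""
    return np.exp(2j * np.pi * np.arange(n) * np.sin(phi)) / np.sqrt(n)

def max_cross_correlation(n, angles):
    """Largest off-diagonal magnitude of the Gram matrix A^* A."""
    A = np.column_stack([ula_response(n, p) for p in angles])
    gram = A.conj().T @ A
    return np.max(np.abs(gram - np.eye(len(angles))))

angles = [0.1, 0.7, 1.3, -0.5]  # illustrative angles, in radians
for n in (16, 64, 256, 1024):
    print(n, max_cross_correlation(n, angles))
```

For distinct directions the cross-correlations decay roughly as $1/N$, so the Gram matrix approaches the identity as the array grows, which is the property used in the example above.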

The above example can be generalized easily to a hybrid fully-connected phase shifter combiner rather than a fully-digital one.

To summarize the results of this section, MaGiQ is a low complexity algorithm that yields low MSE with very few RF chains. When the number of chains increases, Alt-MaG can be used with other methods to solve problem (18) in its first step. The choice of algorithm should be made according to the complexity requirement. For example, if the computational load is not a limiting factor, then MO-AltMin from [10] is a good choice.

V. ANALOG COMBINER DESIGN

We now turn to the design of the analog combiner in the hybrid decomposition

$$W = W_{RF} W_{BB}. \tag{27}$$

For this purpose we assume a fixed precoder, that is, $H$ in (8) is fixed and known.

In (7), we derived the optimal $N_{RF}^r \times N_{RF}^r$ digital combiner, given the hybrid decomposition (27), and reduced the problem to $W_{RF}$ alone. A different approach is to first allow the entire $N_r \times N_{RF}^r$ hybrid combiner to be digital, and then try to approximate the fully-digital solution with a hybrid decomposition, similar to the precoder optimization in the previous section.

This approach was adopted in [4], [9]. There, the authors began by allowing $W$ to be fully-digital, that is $W \in \mathbb{C}^{N_r \times N_{RF}^r}$, and solved problem (4) for fixed precoders. This yields the MMSE estimator of $s$ from $y$, given by

$$W_{mmse} = B^{-1} H, \tag{28}$$

with

$$B = HH^* + R_z. \tag{29}$$

It was shown in [4] that minimizing (4) (for fixed precoders) is equivalent to solving the following optimization problem:

$$\min_{W_{RF}, W_{BB}} \ \|B^{1/2}(W_{mmse} - W_{RF}W_{BB})\|_F^2 \quad \text{s.t.}\ W_{RF} \in \mathcal{W}^{N_r \times N_{RF}^r}. \tag{30}$$

That is, finding the hybrid decomposition of $W_{mmse}$ that minimizes the weighted norm in (30).
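The equivalence behind (30) rests on the identity $\mathrm{MSE}(W) - \mathrm{MSE}(W_{mmse}) = \|B^{1/2}(W_{mmse} - W)\|_F^2$. The sketch below checks it numerically under an assumed model that is not restated in this section: $y = Hs + z$ with $E[ss^*] = I$, $E[zz^*] = R_z$, and the estimate $\hat{s} = W^* y$. The function names are illustrative.

```python
import numpy as np

def psd_sqrt(M):
    """Hermitian square root of a positive semi-definite matrix."""
    w, U = np.linalg.eigh(M)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

def mmse_combiner(H, Rz):
    """W_mmse = B^{-1} H with B = H H^* + R_z, as in (28)-(29)."""
    B = H @ H.conj().T + Rz
    return np.linalg.solve(B, H)

def mse(W, H, Rz):
    """E||s - W^* y||^2 for y = H s + z, E[s s^*] = I (assumed model)."""
    B = H @ H.conj().T + Rz
    Wh = W.conj().T
    n_s = H.shape[1]
    return np.real(np.trace(np.eye(n_s) - Wh @ H - H.conj().T @ W + Wh @ B @ W))

def weighted_gap(W, H, Rz):
    """Objective of (30) evaluated at a given combiner W."""
    B = H @ H.conj().T + Rz
    return np.linalg.norm(psd_sqrt(B) @ (mmse_combiner(H, Rz) - W), 'fro') ** 2

rng = np.random.default_rng(1)
H = (rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))) / np.sqrt(2)
Rz = 0.5 * np.eye(6)
W = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))
print(mse(W, H, Rz) - mse(mmse_combiner(H, Rz), H, Rz), weighted_gap(W, H, Rz))
```

The two printed values agree, which is why any hybrid pair $(W_{RF}, W_{BB})$ that lowers (30) lowers the MSE by the same amount.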


Unlike the precoder case, here (30) and (4) are equivalent. Motivated by this, the authors in [4], [9] assumed a multipath channel structure as in (26), which implies that the optimal combiner is a sparse sum of steering vectors. They then suggested a variation of simultaneous OMP (SOMP) to solve (30), that chooses the analog combiner vectors from a steering dictionary and sets the digital combiner as the corresponding weights. However, this algorithm suffers from degraded performance due to the dictionary constraint.

By dropping the weights matrix $B^{1/2}$, (30) becomes

$$\min_{W_{RF}, W_{BB}} \ \|W_{mmse} - W_{RF}W_{BB}\|_F^2 \quad \text{s.t.}\ W_{RF} \in \mathcal{W}^{N_r \times N_{RF}^r}, \tag{31}$$

which constitutes an upper bound on (30) (divided by the constant factor $\|B^{1/2}\|_F^2$), and is identical (up to the power constraint) to the precoder problem (15) in the previous section. Hence the methods in [5], [6], [8], [10], [12] can be used to solve it. However, solving (31) only optimizes an upper bound on the original objective, and is thus sub-optimal.

A. MaGiQ for Combiner Design

As opposed to previous works, we impose the hybrid decomposition (27), and optimize (4) over both $W_{BB}$ and $W_{RF}$, as in Section III. This leads to the problem of minimizing (10) over $W_{RF}$, which is equivalent to

$$\max_{W_{RF}} \ \mathrm{tr}\left(W_{RF}^* A W_{RF} [W_{RF}^* B W_{RF}]^{-1}\right) \quad \text{s.t.}\ W_{RF} \in \mathcal{W}^{N_r \times N_{RF}^r}, \tag{32}$$

with $A = HH^*$, $B$ as in (29), and $\mathcal{W}$ the feasible set of the given hardware scheme as described in Section II. Note that since $R_z$ is positive definite, it follows that $B$ is invertible.

Proposition 2. The optimal fully-digital solution to (32) is given by

$$W_{opt} = B^{-1/2} U S, \tag{33}$$

where $U$ contains the first $N_{RF}^r$ eigenvectors of the matrix $B^{-1/2} A B^{-1/2}$, with $A = HH^*$, and $H, B$ are as in (8) and (29). The matrix $S$ is any $N_{RF}^r \times N_{RF}^r$ invertible matrix.

Proof. The fully-digital problem corresponding to (32) is

$$\max_{W_{RF}} \ \mathrm{tr}\left(W_{RF}^* A W_{RF} [W_{RF}^* B W_{RF}]^{-1}\right). \tag{34}$$

Let $Q \in \mathbb{C}^{N_r \times N_{RF}^r}$ be defined by $Q = B^{1/2} W_{RF}$. Then the objective in (34) equals $\mathrm{tr}(P_Q B^{-1/2} A B^{-1/2})$, where $P_Q$ is the orthogonal projection onto $\mathcal{R}(Q)$. Let $V$ be an $N_r \times N_{RF}^r$ matrix with orthonormal columns that span $\mathcal{R}(Q)$. Then $P_Q = VV^*$ and $Q = VS$ for some invertible $N_{RF}^r \times N_{RF}^r$ matrix $S$. Thus, the objective becomes $\mathrm{tr}(V^* B^{-1/2} A B^{-1/2} V)$, which is maximized by choosing $V$ as the first $N_{RF}^r$ eigenvectors of $B^{-1/2} A B^{-1/2}$, i.e. $V = U$. Therefore,

$$Q = US,$$

which yields the solution (33).
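Proposition 2 can be verified numerically on a random instance. The sketch below is illustrative only: it assumes that "first eigenvectors" means those of the largest eigenvalues, draws an arbitrary invertible $S$, and checks that the ratio-trace objective at $W_{opt}$ equals the sum of the leading eigenvalues of $B^{-1/2}AB^{-1/2}$ regardless of $S$. The function names are hypothetical.

```python
import numpy as np

def ratio_trace(W, A, B):
    """Objective of (32): tr(W^* A W [W^* B W]^{-1})."""
    Wh = W.conj().T
    return np.real(np.trace(Wh @ A @ W @ np.linalg.inv(Wh @ B @ W)))

def inv_sqrt_pd(B):
    """B^{-1/2} for a Hermitian positive definite B."""
    w, U = np.linalg.eigh(B)
    return (U / np.sqrt(w)) @ U.conj().T

def optimal_digital_combiner(A, B, n_rf, S=None):
    """W_opt = B^{-1/2} U S as in (33), plus the n_rf leading eigenvalues."""
    B_is = inv_sqrt_pd(B)
    M = B_is @ A @ B_is
    M = (M + M.conj().T) / 2          # remove round-off asymmetry
    w, E = np.linalg.eigh(M)          # eigenvalues in ascending order
    U = E[:, ::-1][:, :n_rf]          # eigenvectors of the largest eigenvalues
    S = np.eye(n_rf) if S is None else S
    return B_is @ U @ S, w[::-1][:n_rf]

rng = np.random.default_rng(2)
H = rng.standard_normal((6, 4)) + 1j * rng.standard_normal((6, 4))
X = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
A = H @ H.conj().T
B = A + X @ X.conj().T + np.eye(6)    # B = H H^* + R_z with R_z positive definite
S = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
W_opt, top = optimal_digital_combiner(A, B, 2, S)
print(ratio_trace(W_opt, A, B), top.sum())
```

The invariance to $S$ is what leaves room for the approximation step that follows: any invertible $S$ gives the same objective value, so it can be chosen to bring $W_{opt}$ close to the feasible set.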

Similarly to the previous section, we can now approximate the optimal unconstrained solution (33) with an analog combiner, while looking for an $S$ that yields a minimal approximation gap between the two.

In (33), $S$ is only required to be invertible, in contrast to the unitary constraint in (14). Hence, limiting it to $\mathcal{U}^{N_{RF}^r}$ is restrictive. However, this is a convenient way to enforce the invertibility constraint. The optimization problem we consider is therefore

$$\min_{W_{RF}, S} \ \|B^{-1/2} U S - W_{RF}\|_F^2 \quad \text{s.t.}\ W_{RF} \in \mathcal{W}^{N_r \times N_{RF}^r},\ S \in \mathcal{U}^{N_{RF}^r}, \tag{35}$$

which can be solved using the MaGiQ approach presented in Section IV. Note that here Alt-MaG and MaGiQ coincide, since in step 1 of Alt-MaG the optimization is over $W_{RF}$ alone. Once the algorithm converges, $W_{BB}$ is calculated as in (7).
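A minimal sketch of the alternating minimization for (35) is given below. It is not the authors' code: it assumes the fully-connected phase shifter scheme S2, so that the projection onto the feasible set keeps only the phase of each entry, and it assumes the unitary update is the standard Procrustes solution. The input `X` stands for $B^{-1/2}U$; the function name `magiq_combiner` is hypothetical.

```python
import numpy as np

def magiq_combiner(X, n_iter=50):
    """Alternate between W_RF and a unitary S to reduce ||X S - W_RF||_F^2."""
    n_rf = X.shape[1]
    S = np.eye(n_rf, dtype=complex)
    gaps = []
    for _ in range(n_iter):
        # Fix S: the closest unit-modulus matrix keeps the phases of X S.
        W_rf = np.exp(1j * np.angle(X @ S))
        # Fix W_RF: unitary Procrustes problem, solved through an SVD.
        U, _, Vh = np.linalg.svd(X.conj().T @ W_rf)
        S = U @ Vh
        gaps.append(np.linalg.norm(X @ S - W_rf) ** 2)
    return W_rf, S, gaps

rng = np.random.default_rng(3)
X = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))
W_rf, S, gaps = magiq_combiner(X)
print(gaps[0], gaps[-1])
```

Each of the two updates solves its sub-problem exactly, so the recorded gap cannot increase from one iteration to the next. For other hardware schemes only the projection line would change.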

MaGiQ enjoys good performance with very few RF chains, but when the number of chains increases, other methods are preferable. One of the main factors that degrade MaGiQ is that it only optimizes a bound (31) on (30), rather than directly maximizing (30) (which is equivalent to (32)). Next, we exploit the special structure of the ratio-trace objective in (32), and construct a greedy method for its direct maximization over $W_{RF}$. The proposed GRTM algorithm solves a ratio-of-scalars problem at each step and achieves low MSE when the number of RF chains increases.

Both GRTM and MaGiQ enjoy low complexity and each has its merits. For a small number of RF chains, or a noise-limited case with a fully-connected phase shifters network and a channel that has unimodular singular vectors, one should choose MaGiQ. For a more general channel model and an increasing number of RF chains, GRTM is preferable.

B. GRTM - Greedy Ratio Trace Maximization

We now describe the GRTM approach for directly solving (32). The concept of GRTM is to solve (32) in a greedy manner, where in each iteration we add one RF chain and choose the optimal combiner vector to add to the previously $K$ selected vectors, $1 \le K \le N_{RF}^r$.

Assume we have a solution $W_{RF}^{(K)} \in \mathcal{W}^{N_r \times K}$ for the $K$-sized problem, i.e. (32) with $N_{RF}^r = K$. We now want to add an additional RF chain, and compute the optimal column $w$ such that $W_{RF}^{(K+1)} = [W_{RF}^{(K)}\ w]$. First, note that in order for $W_{RF}^{(K+1)*} B W_{RF}^{(K+1)}$ to be invertible and the objective in (32) to be well defined for the $(K+1)$-sized case, we require $w \notin \mathcal{R}(W_{RF}^{(K)})$. In practice, this condition implies that each RF chain contributes new and independent information with respect to the other chains. Given this condition, the $(K+1)$-sized optimization problem is

$$\max_{w} \ \mathrm{tr}\left(W_{RF}^{(K+1)*} A W_{RF}^{(K+1)} [W_{RF}^{(K+1)*} B W_{RF}^{(K+1)}]^{-1}\right) \quad \text{s.t.}\ w \in \mathcal{W}^{N_r \times 1},\ w \notin \mathcal{R}(W_{RF}^{(K)}). \tag{36}$$


In the next proposition, we show that (36) is equivalent to solving the following vector optimization problem, referred to as the base case:

$$\max_{w} \ \frac{w^* C w}{w^* D w} \quad \text{s.t.}\ w \in \mathcal{W}^{N_r \times 1},\ w^* D w > 0, \tag{37}$$

where $D$ and $C$ are specific $N_r \times N_r$ positive semi-definite (psd) matrices.

Using Proposition 3 leads to the GRTM solution of (32), where in each iteration we choose the best column vector to add to the previously selected combiner vectors by solving (37). The GRTM algorithm is summarized in Algorithm 3.

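Proposition 3. Problems (37) and (36) are equivalent with

$$D = B^{1/2}\left(I_{N_r} - P_{B^{1/2}W_{RF}^{(K)}}\right)B^{1/2},$$

$$\gamma = \mathrm{tr}\left(P_{B^{1/2}W_{RF}^{(K)}}\, B^{-1/2} A B^{-1/2}\right),$$

$$G = B^{1/2} P_{B^{1/2}W_{RF}^{(K)}} B^{-1/2} A^{1/2} - A^{1/2},$$

$$C = \gamma D + GG^*.$$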

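Proof. To prove the proposition, we rely on the following lemma:

Lemma 1. Let $\bar{W} = [W\ w]$ and denote $Q = (W^* W)^{-1}$. Then [20]

$$(\bar{W}^* \bar{W})^{-1} = \begin{bmatrix} Q + \alpha Q W^* w w^* W Q^* & -\alpha Q W^* w \\ -\alpha w^* W Q^* & \alpha \end{bmatrix},$$

with $\alpha = \frac{1}{w^* w - w^* W Q W^* w}$.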

Using Lemma 1 and straightforward algebraic operations we get equality between the two objectives. The first constraint is identical in both problems. It remains to show that the second constraints of (37) and (36) are equivalent.

Since $B$ is invertible, $w \in \mathcal{R}(W_{RF}^{(K)})$ if and only if $B^{1/2} w \in \mathcal{R}(B^{1/2} W_{RF}^{(K)})$. Therefore, if $w \in \mathcal{R}(W_{RF}^{(K)})$, then

$$w^* D w = w^* B w - w^* B^{1/2} P_{B^{1/2}W_{RF}^{(K)}} B^{1/2} w = w^* B w - w^* B^{1/2} B^{1/2} w = 0.$$

In the other direction, note that $D$ is a non-negative Hermitian matrix, and can be decomposed as $D = QQ^*$ for some $Q$. Thus, $w^* D w = 0$ if and only if $Q^* w = 0$. Multiplying both sides of the equation by $Q$ yields $Dw = B^{1/2}(I_{N_r} - P_{B^{1/2}W_{RF}^{(K)}})B^{1/2} w = 0$. Since $B$ is invertible, it follows that $(I_{N_r} - P_{B^{1/2}W_{RF}^{(K)}})B^{1/2} w = 0$, or $B^{1/2} w \in \mathcal{R}(B^{1/2} W_{RF}^{(K)})$, which is equivalent to $w \in \mathcal{R}(W_{RF}^{(K)})$. Hence, we proved that

$$w \in \mathcal{R}(W_{RF}^{(K)}) \iff w^* D w = 0.$$

Since $D$ is non-negative definite, $w^* D w \ge 0$ for all $w$. The previous connection then yields

$$w \notin \mathcal{R}(W_{RF}^{(K)}) \iff w^* D w > 0.$$

As there is equality between the objectives and feasible sets of (37) and (36), both problems are equivalent.
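The equality between the objectives of (36) and (37) can be checked numerically on a random instance. In the sketch below, $\gamma$ is taken as $\mathrm{tr}(P_{B^{1/2}W_{RF}^{(K)}} B^{-1/2} A B^{-1/2})$, i.e. the value of the $K$-sized objective, which is the form for which the two objectives coincide. The matrices are random and the function names are illustrative assumptions.

```python
import numpy as np

def herm_power(M, p):
    """M**p for a Hermitian psd matrix (positive definite if p < 0)."""
    w, U = np.linalg.eigh(M)
    w = np.clip(w, 0.0, None) if p > 0 else w
    return (U * w ** p) @ U.conj().T

def ratio_trace(W, A, B):
    """Objective of (36): tr(W^* A W [W^* B W]^{-1})."""
    Wh = W.conj().T
    return np.real(np.trace(Wh @ A @ W @ np.linalg.inv(Wh @ B @ W)))

def base_case_matrices(W_k, A, B):
    """C and D of the base case (37) for the current combiner W_k."""
    n = A.shape[0]
    B_h, B_ih, A_h = herm_power(B, 0.5), herm_power(B, -0.5), herm_power(A, 0.5)
    Q = B_h @ W_k
    P = Q @ np.linalg.solve(Q.conj().T @ Q, Q.conj().T)  # projection onto R(B^{1/2} W_k)
    D = B_h @ (np.eye(n) - P) @ B_h
    gamma = np.real(np.trace(P @ B_ih @ A @ B_ih))
    G = B_h @ P @ B_ih @ A_h - A_h
    return gamma * D + G @ G.conj().T, D

rng = np.random.default_rng(4)
H = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))
A = H @ H.conj().T
B = A + 0.5 * np.eye(6)
W_k = rng.standard_normal((6, 2)) + 1j * rng.standard_normal((6, 2))
w = rng.standard_normal(6) + 1j * rng.standard_normal(6)
C, D = base_case_matrices(W_k, A, B)
scalar_ratio = np.real(np.vdot(w, C @ w)) / np.real(np.vdot(w, D @ w))
print(scalar_ratio, ratio_trace(np.column_stack([W_k, w]), A, B))
```

The scalar ratio matches the $(K+1)$-sized ratio trace, so ranking candidate vectors by the ratio is the same as ranking them by the full objective, without inverting a matrix per candidate.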

Problem (37) has several advantages over (36): 1) its objective is a scalar ratio and does not involve matrix inversion; 2) this problem is both a constrained ratio-trace, trace-ratio [21] and a Rayleigh quotient problem, and therefore may be solved using techniques for either of these well-known problems. However, these methods require adaptation to fit the additional hardware constraints, which will not be investigated in this work.

One simple method for obtaining a (sub-optimal) solution to (37) is searching over a dictionary $\mathbf{W}$ with columns in $\mathcal{W}^{N_r \times 1}$, and calculating the objective value in (37) for each vector. The vector that corresponds to the largest objective is then chosen. This solution has low computational load thanks to the simplicity of the scalar ratio objective. The choice of dictionary depends on the channel model and hardware constraints. For a fully-connected scheme and a sparse mmWave channel as in (26), a steering vector dictionary as used in [4] exploits the channel structure and yields good performance. For a more general model, Gaussian randomization can be used to construct a good dictionary. In this case, $\mathbf{W} = P_{\mathcal{W}}(X)$, with $P_{\mathcal{W}}(\cdot)$ as defined in Section II, and where $X$ is a random dictionary, commonly chosen as a matrix with columns drawn from $\mathcal{CN}(0, W_{opt} W_{opt}^*)$, with $W_{opt}$ from (33). In the simulations section, we used the dictionary-based solution with Gaussian randomization. To comply with the problem's second constraint, it is necessary to remove the selected column from the dictionary at each iteration. The other columns are, with high probability, not in $\mathcal{R}(W_{RF}^{(K)})$ due to the random complex-Gaussian distribution.

Algorithm 3 GRTM

Input: effective channel $H$, interference correlation $R_z$

Output: analog combiner $W_{RF}$

Initialize: $A = HH^*$, $B = HH^* + R_z$, $C = A$, $D = B$, $W_{RF}^{(0)} = [\,]$

For $K = 0 : N_{RF}^r - 1$:

1) Solve the base case (37) to obtain $w$

2) Update $W_{RF}^{(K+1)} = [W_{RF}^{(K)}\ w]$

3) Update

$$D = B^{1/2}\left(I_{N_r} - P_{B^{1/2}W_{RF}^{(K+1)}}\right)B^{1/2}$$

$$\gamma = \mathrm{tr}\left(P_{B^{1/2}W_{RF}^{(K+1)}}\, B^{-1/2} A B^{-1/2}\right)$$

$$G = B^{1/2} P_{B^{1/2}W_{RF}^{(K+1)}} B^{-1/2} A^{1/2} - A^{1/2}$$

$$C = \gamma D + GG^*$$
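The loop of Algorithm 3 with a dictionary search as the base-case solver can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the dictionary is supplied by the caller (here random unit-modulus columns stand in for a feasible S2 dictionary), $\gamma$ is computed as the current value of the ratio-trace objective, and the names `grtm`, `herm_power` and `ratio_trace` are hypothetical.

```python
import numpy as np

def herm_power(M, p):
    """M**p for a Hermitian psd matrix (positive definite if p < 0)."""
    w, U = np.linalg.eigh(M)
    w = np.clip(w, 0.0, None) if p > 0 else w
    return (U * w ** p) @ U.conj().T

def ratio_trace(W, A, B):
    """Objective of (32), used here only to inspect the result."""
    Wh = W.conj().T
    return np.real(np.trace(Wh @ A @ W @ np.linalg.inv(Wh @ B @ W)))

def grtm(A, B, n_rf, dictionary, tol=1e-9):
    """Greedy selection of n_rf dictionary columns via the scalar ratio (37)."""
    n = A.shape[0]
    B_h, B_ih, A_h = herm_power(B, 0.5), herm_power(B, -0.5), herm_power(A, 0.5)
    C, D = A, B                                   # initialization of Algorithm 3
    available = np.ones(dictionary.shape[1], dtype=bool)
    chosen = []
    for _ in range(n_rf):
        # Step 1: evaluate w^* C w / w^* D w for every remaining candidate.
        num = np.real(np.sum(dictionary.conj() * (C @ dictionary), axis=0))
        den = np.real(np.sum(dictionary.conj() * (D @ dictionary), axis=0))
        ok = available & (den > tol)              # enforces w^* D w > 0
        score = np.where(ok, num / np.where(ok, den, 1.0), -np.inf)
        j = int(np.argmax(score))
        # Step 2: append the winner and remove it from the dictionary.
        chosen.append(j)
        available[j] = False
        # Step 3: update D, gamma, G and C for the enlarged combiner.
        Q = B_h @ dictionary[:, chosen]
        P = Q @ np.linalg.solve(Q.conj().T @ Q, Q.conj().T)
        D = B_h @ (np.eye(n) - P) @ B_h
        gamma = np.real(np.trace(P @ B_ih @ A @ B_ih))
        G = B_h @ P @ B_ih @ A_h - A_h
        C = gamma * D + G @ G.conj().T
    return dictionary[:, chosen]

rng = np.random.default_rng(5)
H = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))
A = H @ H.conj().T
B = A + 0.5 * np.eye(8)
candidates = np.exp(1j * rng.uniform(0, 2 * np.pi, size=(8, 40)))
W_rf = grtm(A, B, 3, candidates)
print([ratio_trace(W_rf[:, :k], A, B) for k in (1, 2, 3)])
```

The sketch assumes the dictionary contains at least `n_rf` linearly independent columns. The cost per iteration is dominated by two matrix products with the dictionary, which reflects the low complexity attributed to the scalar-ratio form.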

Simulations demonstrate that GRTM has lower MSE than MaGiQ when the number of RF chains increases. In that case, it outperforms other low complexity methods in various SNR, channel models, and hardware scenarios.

C. Combiner Design for Kronecker Model Channel Estimation

Problem (32) arises in other communication problems, so that GRTM can be used to address those problems as well. One such scenario is MIMO channel estimation under a Kronecker model in the interference-limited case [22], [23].

This problem consists of a single base station (BS) with $N_{BS}$ antennas and $N_{RF} < N_{BS}$ RF chains, and $K$ single-antenna users in a time-synchronized time division duplex (TDD) system. In the uplink channel estimation phase, each user sends $\tau$ training symbols to the BS, during which the channel is assumed to be constant.

The (discrete time) received signal at the BS can be written as

$$X = W_{RF}^* H S^T + Z, \tag{38}$$

where $H \in \mathbb{C}^{N_{BS} \times K}$ is the desired channel between the users and the BS, $S \in \mathbb{C}^{\tau \times K}$ is the pilot sequence matrix, $W_{RF} \in \mathcal{W}^{N_{BS} \times N_{RF}}$ is the analog combiner and $Z$ is the interference.

Both the channel and interference follow the doubly correlated Kronecker model [24]

$$H = R_r^{1/2} \bar{H} R_t^{1/2}, \quad Z = R_r^{1/2} \bar{Z} Q^{1/2}, \tag{39}$$

with $R_r \in \mathbb{C}^{N_{BS} \times N_{BS}}$ the BS's receive side correlation matrix, $R_t \in \mathbb{C}^{K \times K}$ the transmit side correlation matrix of the users and $Q \in \mathbb{C}^{\tau \times \tau}$ the transmit side correlation matrix of the interference. The matrices $\bar{H} \in \mathbb{C}^{N_{BS} \times K}$, $\bar{Z} \in \mathbb{C}^{N_{BS} \times \tau}$ have independent entries with i.i.d. complex-normal distribution. All correlation matrices are known.

In the channel estimation phase, the goal is to estimate $H$ from $X$. For this purpose it is desirable to design the analog combiner $W_{RF}$ to minimize the MSE in estimation. As shown in [22], the estimation error of the MMSE estimator for $H$ from $X$ is inversely proportional to

$$\mu \triangleq \mathrm{tr}\left(W_{RF}^* R_r^2 W_{RF} (W_{RF}^* R_r W_{RF})^{-1}\right), \tag{40}$$

which is equal to (32) with $A = R_r^2$ and $B = R_r$. Hence, all the previously mentioned methods may be used to design $W_{RF}$.

VI. NUMERICAL EXPERIMENTS

We now demonstrate the performance of the suggested beamformers under different hardware constraints. Unless mentioned otherwise, the channel model used is the narrowband mmWave clustered channel defined in (26), with $\alpha_{il} \sim \mathcal{CN}(0,1)$, $N_{cl} = 6$, and $N_{ray} = 1$. We consider a simple uniform linear array (ULA) with spacing $d = \frac{\lambda}{2}$, where $\lambda$ is the carrier wavelength. It follows that the steering vectors depend only on the azimuth and are given by

$$a_q(\phi) = \frac{1}{\sqrt{N_q}} \left[1, e^{j2\pi \sin(\phi)}, \cdots, e^{j2\pi (N_q - 1)\sin(\phi)}\right], \tag{41}$$

for $q \in \{r, t\}$. The angles $\phi^t_{il}, \phi^r_{il}$ are distributed uniformly over the interval $[0, 2\pi)$.

In all simulations, the number of transmit antennas is $N_t = 10$ and the number of receive antennas is $N_r = 15$. Unless mentioned otherwise, the interference is a white complex-Gaussian noise with zero mean and variance $\sigma_z^2 = 1$. For the precoder experiments, we used a fully-digital receiver, i.e. $N_{RF}^r = N_r$, and for the combiner experiments we used a fully-digital transmitter, i.e. $N_{RF}^t = N_t$.
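The simulation setup can be sketched in a few lines. The following generator draws one realization of the clustered channel (26) with the ULA response (41) as printed and the parameters listed above. It is a sketch under those stated assumptions, with hypothetical function names, and is not the authors' simulation code.

```python
import numpy as np

def ula_response(n, phi):
    """ULA steering vector following (41) as printed."""
    return np.exp(2j * np.pi * np.arange(n) * np.sin(phi)) / np.sqrt(n)

def clustered_channel(n_t=10, n_r=15, n_cl=6, n_ray=1, rng=None):
    """One realization of (26) with CN(0,1) gains and uniform azimuth angles."""
    rng = np.random.default_rng() if rng is None else rng
    H = np.zeros((n_r, n_t), dtype=complex)
    for _ in range(n_cl * n_ray):
        # Complex gain with unit variance.
        alpha = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
        # Angles of arrival and departure, uniform over [0, 2*pi).
        phi_r, phi_t = rng.uniform(0.0, 2.0 * np.pi, size=2)
        H += alpha * np.outer(ula_response(n_r, phi_r), ula_response(n_t, phi_t).conj())
    return np.sqrt(n_t * n_r / (n_cl * n_ray)) * H

H = clustered_channel(rng=np.random.default_rng(0))
print(H.shape, np.linalg.matrix_rank(H, tol=1e-8))
```

With $N_{ray} = 1$ each realization is a sum of $N_{cl}$ rank-one terms, so its rank is at most $N_{cl} = 6$, which is the sparsity that the steering-dictionary methods exploit.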

A. Hybrid Precoder Performance

First, we present the performance of different hybrid precoders. Here, we compare five design methods. The first two are the MO-AltMin algorithm from [10], which performs manifold optimization to minimize (15) over $F_{RF}$ and $F_{BB}$, and the PE-AltMin method, also from [10], which assumes $F_{BB}$ to be a scaled orthogonal matrix and employs an algorithm similar to Algorithm 2, only with $F_{BB}$ calculated in the last step as $F_{BB} = \frac{\sqrt{N_s}}{\|F_{RF} T\|_F} T$. The third is the SOMP of [4], which reformulates (15) as a sparsity problem, chooses the columns of $F_{RF}$ from a dictionary of candidate vectors, and simultaneously solves for the corresponding weights $F_{BB}$. The last two are our methods: MaGiQ in Algorithm 2, and the Alt-MaG method in Algorithm 1 with the MO-AltMin solution used to obtain $F_{RF}, F_{BB}$ in step 1. For the AltMin algorithms, we used the Matlab functions of [25]. The dictionary used for the SOMP algorithm consists of 1000 steering vectors $\{a_t(\frac{2\pi}{1000} q)\}_{q=1}^{1000}$. As a performance measure we consider the gap between the estimation error of the optimal precoder (14) and the hybrid one, defined by $\varepsilon - \varepsilon_{opt}$ with $\varepsilon = \frac{1}{N_s Q} \sum_{q=1}^{Q} \|s - \hat{s}\|^2$ and $Q$ the number of realizations.

We begin by comparing MaGiQ and PE-AltMin considering a fully-connected phase shifters network S2. As mentioned before, these two algorithms are very similar except for the last step of calculating $F_{BB}$, which is suboptimal in PE-AltMin. This is demonstrated in Fig. 2, where MaGiQ's lower MSE can be seen. In the next simulations, PE-AltMin will not be tested.

Figure 3 presents the estimation gap with respect to the number of RF chains for the different design methods. The SOMP algorithm suffers from a large performance gap, especially when the number of RF chains is much less than the number of multipath components, i.e. $N_{RF}^t \ll N_{cl}$. MaGiQ enjoys both low complexity and good performance due to the alternating minimization over $T$ in (16). The MO-AltMin algorithm achieves low MSE at the cost of long running time. The gap between Alt-MaG and MO-AltMin demonstrates the potential gain in optimizing over the matrix $T$ in (16), in comparison to optimizing only $F_{RF}, F_{BB}$ as in (15). This approach substantially improves the performance, with negligible increase in complexity.

B. Hybrid Combiner Performance

We now investigate the combiner's performance. Here, an additional algorithm is tested, the GRTM in Algorithm 3, with the dictionary-based method in Section V-B as the base case solution. In contrast to the precoder case, here the SOMP algorithm solves the problem (30) as suggested in [4].

Figure 4 presents the estimation gap from the optimal combiner for all approaches with the fully-connected phase shifters scheme S2. It can be seen that MO-AltMin provides the best results. However, its runtime is an order of magnitude larger than the runtime of the other methods. Again, in the case of very few RF chains, MaGiQ is preferred over the greedy methods GRTM and SOMP, which suffer from a large performance gap in that scenario. With additional RF chains, the greedy algorithms outperform the simple MaGiQ, with a slight advantage to GRTM.

Fig. 2: Precoder estimation gap $\varepsilon - \varepsilon_{opt}$ vs. number of RF chains $N_{RF}^t$, for MaGiQ and PE-AltMin.

Fig. 3: Precoder estimation gap $\varepsilon - \varepsilon_{opt}$ vs. number of RF chains $N_{RF}^t$, for a fully-connected phase shifters network S2.

Next, we show that in some special cases, MaGiQ coincides with the fully-digital solution and outperforms other methods. We demonstrate this using the scenario from the asymptotic example given in Section IV. Here, $H$ is the mmWave channel (26), with $N_{cl} = 4$, $N_{RF}^r = 4$, and $N_r = 150$. In this case, MO-AltMin cannot be used due to its long runtime, which grows with the number of antennas. Since $N_{cl} \ll N_r$ the asymptotic analysis holds, and $H$ has unimodular singular vectors. Figure 5 shows the estimation error $\varepsilon$ of the different algorithms. It can be seen that MaGiQ's performance coincides with the fully-digital combiner. We used a steering dictionary for both SOMP and GRTM. Since the singular vectors of the channel also have steering structure, the performance gap of the greedy methods is small, but still exists.

Next, we investigate the partially connected schemes. Figure 6 shows the performance of different combiners for the fixed sub-arrays network S4 and for the flexible one S5. For both scenarios, we set $G = 5$. We first notice the additional performance gain of the flexible architecture. This result is natural since the additional switches allow each RF chain to choose the antennas that contribute to it, in contrast to the fixed sub-arrays case, which forces each RF chain to be connected to a small predetermined array. In both scenarios, the greedy methods outperform MaGiQ, but while SOMP yields better performance for the fixed arrays, when adding switches to allow for flexibility, GRTM is preferred.

Fig. 4: Combiner estimation gap $\varepsilon - \varepsilon_{opt}$ vs. number of RF chains $N_{RF}^r$, for a fully-connected phase shifters network S2.

The fully-connected network S1, which involves switches as well as phase shifters, is considered next. Here, we use a general channel model such that $H$ has i.i.d. complex-Gaussian entries and the interference $z$ is now a complex-Gaussian vector with arbitrary full rank covariance matrix $R_z$. For the SOMP and GRTM algorithms, we used the dictionary $\mathbf{W} = P_{\mathcal{W}}(X)$, with $\mathcal{W}$ the feasible set for scheme S1, as explained in Section V-B. Figure 7 presents the MSE $\varepsilon$ as a function of SNR. The complex MO-AltMin algorithm now achieves approximately the same performance as the low complexity GRTM. This is because MO-AltMin cannot use the additional flexibility of the switches and produces only unimodular beamformer vectors. This demonstrates a trade-off between hardware and computational complexity: one may use the simpler architecture S2 while achieving the same performance as the complex network S1, at the cost of using a heavy computational algorithm such as MO-AltMin. It can be further noticed that SOMP's performance falls short in comparison to the others, possibly due to the different interference structure.

C. Hardware Schemes Comparison

The last simulation considers the different hardware choices. In Fig. 8, the performance of the MaGiQ combiner under different hardware constraints is demonstrated. As expected, the simple switching network S3 suffers from the largest MSE, as its connectivity is the most limited. The two partially connected cases S4 and S5 with $G = 5$ offer lower MSE, due to the additional hardware complexity, where the second performs better as a result of the additional switches. The two fully-connected architectures S1 and S2 show similar performance, suggesting that the additional switches in the first offer negligible improvement. This surprising result can be explained by the channel model, which implies that the optimal combiner consists of sums of steering vectors, which can be efficiently described using unimodular vectors.

Fig. 5: Combiner estimation error vs. SNR for a fully-connected phase shifters network S2 and mmWave channel with asymptotic number of antennas.

Fig. 6: Combiner estimation error vs. SNR for the partially connected phase shifters networks S4 and S5 with $G = 5$.

Fig. 7: Combiner estimation error vs. SNR for the fully-connected phase shifters and switches network S1 with general channel model and interference.

Fig. 8: Combiner estimation error vs. SNR for a partially connected phase shifters network S4 with $G = 3$.

VII. CONCLUSIONS

We developed a framework for hybrid precoder and combiner design, suitable for various channel models and hardware settings. We considered a data estimation problem and aimed at minimizing the estimation MSE over all possible hybrid precoders/combiners. For the precoder side, we suggested a family of iterative algorithms, Alt-MaG, that approximates the optimal fully-digital precoder while seeking the solution in the optimal set that results in the smallest approximation gap from the hybrid precoder of the previous iteration. The potential gain in exploiting the unitary degree of freedom in the fully-digital precoder was demonstrated in simulations. We further suggested a simple low complexity algorithm, MaGiQ, that achieves good approximation using a simple quantization function, and yields better performance than other methods suggested in the literature. We also showed that in some special cases it coincides with the optimal fully-digital solution.

Next, we adjusted the MaGiQ algorithm so that it can be used for combiner design. We then suggested an additional greedy algorithm, GRTM, that directly minimizes the MSE using a simple scalar-ratio objective at each iteration. GRTM achieves lower MSE than MaGiQ when the number of RF chains increases. Experimental results showed that our low-complexity algorithms enjoy good performance in various scenarios.

Finally, using simulations, we showed that in partially connected schemes and sparse multipath mmWave channels, adding switches to the analog network offers a large increase in performance, while in fully-connected cases the improvement is negligible.

REFERENCES

[1] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. K. Soong, and J. C. Zhang, “What will 5G be?” IEEE Journal on Selected Areas in Communications, vol. 32, no. 6, pp. 1065–1082, Jun. 2014.

[2] F. Rusek, D. Persson, B. K. Lau, E. G. Larsson, T. L. Marzetta, and F. Tufvesson, “Scaling up MIMO: opportunities and challenges with very large arrays,” IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 40–60, Jan. 2013.

[3] T. S. Rappaport, S. Sun, R. Mayzus, H. Zhao, Y. Azar, K. Wang, G. N. Wong, J. K. Schulz, M. Samimi, and F. Gutierrez, “Millimeter wave mobile communications for 5G cellular: it will work!” IEEE Access, vol. 1, pp. 335–349, 2013.

[4] O. E. Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Transactions on Wireless Communications, vol. 13, no. 3, pp. 1499–1513, Mar. 2014.

[5] A. Alkhateeb, O. El Ayach, G. Leus, and R. W. Heath, “Hybrid precoding for millimeter wave cellular systems with partial channel knowledge,” in Information Theory and Applications Workshop (ITA), 2013. IEEE, 2013, pp. 1–5.

[6] Y.-Y. Lee, C.-H. Wang, and Y.-H. Huang, “A hybrid RF/baseband precoding processor based on parallel-index-selection matrix-inversion-bypass simultaneous orthogonal matching pursuit for millimeter wave MIMO systems,” IEEE Transactions on Signal Processing, vol. 63, no. 2, pp. 305–317, Jan. 2015.

[7] S. Han, I. Chih-Lin, Z. Xu, and C. Rowell, “Large-scale antenna systems with hybrid analog and digital beamforming for millimeter wave 5G,” IEEE Communications Magazine, vol. 53, no. 1, pp. 186–194, 2015.

[8] R. Mendez-Rial, C. Rusu, N. Gonzalez-Prelcic, A. Alkhateeb, and R. W. Heath, “Hybrid MIMO architectures for millimeter wave communications: phase shifters or switches?” IEEE Access, vol. 4, pp. 247–267, 2016.

[9] M. Kim and Y. H. Lee, “MSE-based hybrid RF/baseband processing for millimeter-wave communication systems in MIMO interference channels,” IEEE Transactions on Vehicular Technology, vol. 64, no. 6, pp. 2714–2720, Jun. 2015.

[10] X. Yu, J.-C. Shen, J. Zhang, and K. B. Letaief, “Alternating minimization algorithms for hybrid precoding in millimeter wave MIMO systems,” IEEE Journal of Selected Topics in Signal Processing, vol. 10, no. 3, pp. 485–500, Apr. 2016.

[11] J. Singh and S. Ramakrishna, “On the feasibility of codebook-based beamforming in millimeter wave systems with multiple antenna arrays,” IEEE Transactions on Wireless Communications, vol. 14, no. 5, pp. 2670–2683, May 2015.

[12] X. Yu, J. Zhang, and K. B. Letaief, “Partially-connected hybrid precoding in mm-Wave systems with dynamic phase shifter networks,” arXiv preprint arXiv:1705.00859, 2017.

[13] S. Park, A. Alkhateeb, and R. W. Heath, “Dynamic subarrays for hybrid precoding in wideband mmWave MIMO systems,” IEEE Transactions on Wireless Communications, vol. 16, no. 5, pp. 2907–2920, May 2017.

[14] Y. C. Eldar and G. Kutyniok, Compressed sensing: theory and applications. Cambridge University Press, 2012.

[15] A. Scaglione, P. Stoica, S. Barbarossa, G. Giannakis, and H. Sampath, “Optimal designs for space-time linear precoders and decoders,” IEEE Transactions on Signal Processing, vol. 50, no. 5, pp. 1051–1064, May 2002.

[16] S. X. Yu and J. Shi, “Multiclass spectral clustering,” in Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2, ser. ICCV '03. Washington, DC, USA: IEEE Computer Society, 2003.

[17] A. Sayeed, “Deconstructing multiantenna fading channels,” IEEE Transactions on Signal Processing, vol. 50, no. 10, pp. 2563–2579, Oct. 2002.

[18] P. F. M. Smulders and L. M. Correia, “Characterisation of propagation in 60 GHz radio channels,” Electronics & Communication Engineering Journal, vol. 9, no. 2, pp. 73–80, 1997.

[19] H. Xu, V. Kukshya, and T. S. Rappaport, “Spatial and temporal characteristics of 60-GHz indoor channels,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 3, pp. 620–630, 2002.

[20] K. B. Petersen, M. S. Pedersen, J. Larsen, K. Strimmer, L. Christiansen, K. Hansen, L. He, L. Thibaut, M. Baro, S. Hattinger, V. Sima, and W. The, “The matrix cookbook,” Tech. Rep., 2006.

[21] H. Wang, S. Yan, D. Xu, X. Tang, and T. Huang, “Trace ratio vs. ratio trace for dimensionality reduction,” in Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on. IEEE, 2007, pp. 1–8.

[22] S. S. Ioushua and Y. C. Eldar, “Pilot contamination mitigation with reduced RF chains,” Signal Processing Advances in Wireless Communications (SPAWC), July 2017.

[23] E. Bjornson and B. Ottersten, “A framework for training-based estimation in arbitrarily correlated Rician MIMO channels with Rician disturbance,” IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1807–1820, Mar. 2010.

[24] A. Tulino, A. Lozano, and S. Verdu, “Impact of antenna correlation on the capacity of multiantenna channels,” IEEE Transactions on Information Theory, vol. 51, no. 7, pp. 2491–2509, Jul. 2005. [Online]. Available: http://ieeexplore.ieee.org/document/1459054/

[25] N. Boumal, B. Mishra, P. Absil, and R. Sepulchre, “Manopt, a Matlab toolbox for optimization on manifolds,” J. Mach. Learn. Research, vol. 15, pp. 1455–1459, Jan. 2014.
