ieee transactions on wireless communications, vol....

14
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 17, NO. 4, APRIL 2018 2511 Enhanced Group Sparse Beamforming for Green Cloud-RAN: A Random Matrix Approach Yuanming Shi , Member, IEEE, Jun Zhang , Senior Member, IEEE, Wei Chen , Senior Member, IEEE, and Khaled B. Letaief, Fellow, IEEE Abstract— Group sparse beamforming is a general framework to minimize the network power consumption for cloud radio access networks, which, however, suffers high computational complexity. In particular, a complex optimization problem needs to be solved to obtain the remote radio head (RRH) ordering criterion in each transmission block, which will help to determine the active RRHs and the associated fronthaul links. In this paper, we propose innovative approaches to reduce the complexity of this key step in group sparse beamforming. Specifically, we first develop a smoothed p -minimization approach with the iterative reweighted-2 algorithm to return a Karush–Kuhn– Tucker (KKT) point solution, as well as enhance the capa- bility of inducing group sparsity in the beamforming vectors. By leveraging the Lagrangian duality theory, we obtain closed- form solutions at each iteration to reduce the computational complexity. The well-structured solutions provide opportunities to apply the large-dimensional random matrix theory to derive deterministic approximations for the RRH ordering criterion. Such an approach helps to guide the RRH selection only based on the statistical channel state information, which does not require frequent update, thereby significantly reducing the computation overhead. Simulation results shall demonstrate the performance gains of the proposed p -minimization approach, as well as the effectiveness of the large system analysis-based framework for computing the RRH ordering criterion. Index Terms— Cloud-RAN, green communications, sparse optimization, smoothed p -minimization, Lagrangian duality, and random matrix theory. Manuscript received March 20, 2017; revised July 21, 2017 and November 19, 2017; accepted January 14, 2018. Date of publication February 6, 2018; date of current version April 8, 2018. This work was supported in part by the National Nature Science Foundation of China under Grant 61601290, in part by the Shanghai Sailing Program under Grant 16YF1407700, in part by the Hong Kong Research Grant Council under Grant 16200214, in part by the National Natural Science Foundation of China under Project 61671269 and Project 61621091, and in part by the Chinese National 973 Program under Project 2013CB336600. This paper was presented at the IEEE International Symposium on Information Theory, Barcelona, Spain, July 2016 [1]. The associate editor coordinating the review of this paper and approving it for publication was L. Song. (Corresponding author: Yuanming Shi.) Y. Shi is with the School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China (e-mail: [email protected]). J. Zhang is with the Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong (e-mail: [email protected]). W. Chen is with the Department of Electronic Engineering, Tsinghua University, Beijing 100084, China (e-mail: [email protected]). K. B. Letaief is with Hamad Bin Khalifa University, Doha, Qatar, and also with the Hong Kong University of Science and Technology, Hong Kong (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TWC.2018.2797203 I. I NTRODUCTION N ETWORK densification [2]–[4] has been proposed as a promising way to provide ultra-high data rates, achieve low latency, and support ubiquitous connectivity for the upcoming 5G networks [5]. However, to fully harness the benefits of dense wireless networks, formidable challenges arise, including interference management, radio resource allo- cation, mobility management, as well as high capital expen- diture and operating expenditure. Cloud-RAN emerges as a disruptive technology to deploy cost-effective dense wireless networks [6], [7]. It can significantly improve both energy and spectral efficiency, by leveraging recent advances in cloud computing and network function virtualization [8]. With shared computation resources at the cloud data center and distributed low-cost low-power remote radio heads (RRHs), Cloud-RAN provides an ideal platform to achieve the benefits of network cooperation and coordination [9]. There is a unique characteristic in Cloud-RANs, namely, the high-capacity fron- thaul links are required to connect the cloud center and RRHs [10], [11]. Such links will consume power comparable to that of each RRH, and thus brings new challenges to design green Cloud-RANs. To address this issue, a new performance metric, i.e., the network power consumption, which consists of both the fronthaul link power consumption and RRH transmit power consumption, has been identified in [7] for the green Cloud-RAN design. Unfortunately, the network power minimization problem turns out to be a mixed-integer nonlinear programming prob- lem [12], which is highly intractable. Specifically, the com- binatorial composite objective function involves a discrete component, indicating which RRHs and the corresponding fronthaul links should be switched on, and a continuous component, i.e., beamformers to reduce RRH transmit power. To address this challenge, a unique group sparsity structure in the optimal beamforming vector has been identified in [7] to unify RRH selection and beamformer optimization. Accord- ingly, a novel three-stage group sparse beamforming frame- work was proposed to promote group sparsity in the solution. Specifically, a mixed 1 /2 -minimization approach was first proposed to induce group sparsity in the solution, thereby guiding the RRH selection via solving a sequence of convex feasibility problems, followed by coordinated beamforming to minimize the transmit power for the active RRHs. Although the group sparse beamforming framework pro- vides polynomial-time complexity algorithms via convex 1536-1276 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Upload: others

Post on 08-Mar-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 17, NO. 4, APRIL 2018 2511

Enhanced Group Sparse Beamforming for GreenCloud-RAN: A Random Matrix Approach

Yuanming Shi , Member, IEEE, Jun Zhang , Senior Member, IEEE, Wei Chen , Senior Member, IEEE,and Khaled B. Letaief, Fellow, IEEE

Abstract— Group sparse beamforming is a general frameworkto minimize the network power consumption for cloud radioaccess networks, which, however, suffers high computationalcomplexity. In particular, a complex optimization problem needsto be solved to obtain the remote radio head (RRH) orderingcriterion in each transmission block, which will help to determinethe active RRHs and the associated fronthaul links. In this paper,we propose innovative approaches to reduce the complexityof this key step in group sparse beamforming. Specifically,we first develop a smoothed ℓ p-minimization approach with theiterative reweighted-ℓ2 algorithm to return a Karush–Kuhn–Tucker (KKT) point solution, as well as enhance the capa-bility of inducing group sparsity in the beamforming vectors.By leveraging the Lagrangian duality theory, we obtain closed-form solutions at each iteration to reduce the computationalcomplexity. The well-structured solutions provide opportunitiesto apply the large-dimensional random matrix theory to derivedeterministic approximations for the RRH ordering criterion.Such an approach helps to guide the RRH selection only based onthe statistical channel state information, which does not requirefrequent update, thereby significantly reducing the computationoverhead. Simulation results shall demonstrate the performancegains of the proposed ℓ p-minimization approach, as well as theeffectiveness of the large system analysis-based framework forcomputing the RRH ordering criterion.

Index Terms— Cloud-RAN, green communications, sparseoptimization, smoothed ℓ p-minimization, Lagrangian duality,and random matrix theory.

Manuscript received March 20, 2017; revised July 21, 2017 andNovember 19, 2017; accepted January 14, 2018. Date of publicationFebruary 6, 2018; date of current version April 8, 2018. This work wassupported in part by the National Nature Science Foundation of China underGrant 61601290, in part by the Shanghai Sailing Program under Grant16YF1407700, in part by the Hong Kong Research Grant Council under Grant16200214, in part by the National Natural Science Foundation of China underProject 61671269 and Project 61621091, and in part by the Chinese National973 Program under Project 2013CB336600. This paper was presented atthe IEEE International Symposium on Information Theory, Barcelona, Spain,July 2016 [1]. The associate editor coordinating the review of this paperand approving it for publication was L. Song. (Corresponding author:Yuanming Shi.)

Y. Shi is with the School of Information Science andTechnology, ShanghaiTech University, Shanghai 201210, China (e-mail:[email protected]).

J. Zhang is with the Department of Electronic and Computer Engineering,The Hong Kong University of Science and Technology, Hong Kong (e-mail:[email protected]).

W. Chen is with the Department of Electronic Engineering, TsinghuaUniversity, Beijing 100084, China (e-mail: [email protected]).

K. B. Letaief is with Hamad Bin Khalifa University, Doha, Qatar, andalso with the Hong Kong University of Science and Technology, Hong Kong(e-mail: [email protected]).

Color versions of one or more of the figures in this paper are availableonline at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TWC.2018.2797203

I. INTRODUCTION

NETWORK densification [2]–[4] has been proposed as apromising way to provide ultra-high data rates, achieve

low latency, and support ubiquitous connectivity for theupcoming 5G networks [5]. However, to fully harness thebenefits of dense wireless networks, formidable challengesarise, including interference management, radio resource allo-cation, mobility management, as well as high capital expen-diture and operating expenditure. Cloud-RAN emerges as adisruptive technology to deploy cost-effective dense wirelessnetworks [6], [7]. It can significantly improve both energyand spectral efficiency, by leveraging recent advances incloud computing and network function virtualization [8]. Withshared computation resources at the cloud data center anddistributed low-cost low-power remote radio heads (RRHs),Cloud-RAN provides an ideal platform to achieve the benefitsof network cooperation and coordination [9]. There is a uniquecharacteristic in Cloud-RANs, namely, the high-capacity fron-thaul links are required to connect the cloud center andRRHs [10], [11]. Such links will consume power comparableto that of each RRH, and thus brings new challenges to designgreen Cloud-RANs. To address this issue, a new performancemetric, i.e., the network power consumption, which consists ofboth the fronthaul link power consumption and RRH transmitpower consumption, has been identified in [7] for the greenCloud-RAN design.

Unfortunately, the network power minimization problemturns out to be a mixed-integer nonlinear programming prob-lem [12], which is highly intractable. Specifically, the com-binatorial composite objective function involves a discretecomponent, indicating which RRHs and the correspondingfronthaul links should be switched on, and a continuouscomponent, i.e., beamformers to reduce RRH transmit power.To address this challenge, a unique group sparsity structure inthe optimal beamforming vector has been identified in [7] tounify RRH selection and beamformer optimization. Accord-ingly, a novel three-stage group sparse beamforming frame-work was proposed to promote group sparsity in the solution.Specifically, a mixed ℓ1/ℓ2-minimization approach was firstproposed to induce group sparsity in the solution, therebyguiding the RRH selection via solving a sequence of convexfeasibility problems, followed by coordinated beamforming tominimize the transmit power for the active RRHs.

Although the group sparse beamforming framework pro-vides polynomial-time complexity algorithms via convex

1536-1276 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Page 2: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

2512 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 17, NO. 4, APRIL 2018

optimization, it has the limited capability to enhance groupsparsity compared to the non-convex approaches [13], [14].In particular, the smoothed ℓp-minimization supported by theiterative reweighted-ℓ2 algorithm was developed in [15] toenhance sparsity for multicast group sparse beamforming.However, the computation burden of this iterative algorithmis still prohibitive in dense wireless networks with a largenumber of RRHs and mobile users. In [16], a generic two-stage approach was proposed to solve large-scale convexoptimization problems in dense wireless cooperative networksvia matrix stuffing and operator splitting, i.e., the alternatingdirection method of multipliers (ADMM) [17], [18]. Thisapproach also enables parallel computing and infeasibilitydetection. However, as the proposed solutions need to beaccomplished for each channel realization, and also due tothe iterative procedures of the group sparse beamformingframework, it is still computationally expensive.

In this paper, we improve the performance of groupsparse beamforming [7] via ℓp-minimization, and a spe-cial emphasis is on reducing the computational complex-ity, with two ingredients: closed-form solutions for eachiteration, and the deterministic equivalents of the optimalLagrange multipliers. Specifically, we first develop an iterativereweighted-ℓ2 algorithm with closed-form solutions at eachiteration via leveraging the principles of the majorization-minimization (MM) algorithm [19] and the Lagrangian dualitytheory [20]. It turns out that the proposed iterative reweighted-ℓ2 algorithm can find a KKT point for the smoothed non-convex ℓp-minimization problem. Furthermore, this reveals theexplicit structures of the optimal solutions to each subproblemin the iterative reweighted-ℓ2 algorithm, while the convexoptimization approach in [7], [15], and [21] fails to obtain theclosed-form solutions. Thereafter, the well-structured closed-form solutions provide opportunities to leverage the large-dimensional random matrix theory [22]–[24] to perform theasymptotic analysis in the large system regimes [25], [26].Specifically, the deterministic equivalents of the optimalLagrange multipliers are derived based on the recent resultsof large random matrix analysis [27] to decouple the depen-dency of system parameters. These results are further usedto perform an asymptotic analysis for computing the RRHordering criterion, which only depends on long-term channelattenuation, thereby significantly reducing the computationoverhead compared with previous algorithms that heavilydepends on instantaneous CSI [7], [15], [21], [28]–[31].

Based on the above proposal, we provide a three-stageenhanced group sparse beamforming framework for greenCloud-RAN via random matrix theory. Specifically, in thefirst stage, we compute the enhanced RRH ordering criteriononly based on the statistical CSI. This thus avoids frequentupdates, thereby significantly reducing computational com-plexity, which is a sharp difference compared to the originalproposal in [8]. This new algorithm is based on the principlesof the MM algorithm and the Lagrangian duality theory,followed by the random matrix theory. With the obtained deter-ministic equivalents of RRH ordering criterion, a two-stagelarge-scale convex optimization framework [16] is adopted tosolve a sequence of convex feasibility problems to determine

the active RRHs in the second stage, as well as solving thetransmit power minimization problem for the active RRHsin the final stage. Note that global instantaneous CSI isrequired for these two stages. Simulation results are providedto demonstrate the improvement of the enhanced group sparsebeamforming framework. Moreover, the deterministic approxi-mations turn out to be accurate even in the finite-sized systems.

A. Related Works1) Sparse Optimization in Wireless Networks: The sparse

optimization paradigm has recently been popular for com-plicated network optimization problems in wireless networkdesign, e.g., the group sparse beamforming framework forgreen Cloud-RAN design [7], [15], [30]–[32], wireless cachingnetworks [33], [34], user admission control [15], [35], as wellas computation offloading [36]. In particular, the convexrelaxation approach provides a principled way to inducesparsity via the ℓ1-minimization [35] for individual sparsityinducing and the mixed ℓ1/ℓ2-minimization for group sparsityinducing [7]. The reweighted-ℓ1 algorithm [13] and thereweighted-ℓ2 algorithm [15] were further developed toenhance sparsity. To enable parallel and distributed computing,the first-order method ADMM algorithm was adopted to solvethe group sparse beamforming problems [37]. A generic large-scale convex optimization framework was further proposed tosolve general large-scale convex programs in dense wirelessnetworks to enable scalability, parallel computing and infea-sibility detection [16].

However, all the above algorithms need to be computed foreach channel realization, which is computationally expensive.In this paper, we adopt large system analysis to compute theRRH ordering criterion for network adaptation only based onstatistical CSI.

2) Large System Analysis via Random Matrix Theory:Random matrix theory [22] has been proven powerful forperformance analysis, and the understanding and improv-ing algorithms in wireless communications [23], [38], signalprocessing [39], and machine learning [40], [41], especiallyin large dimensional regimes for applications in the era ofbig data [42]. In dense wireless networks, random matrixtheory provides a powerful way for performance analysisand algorithm design. In particular, the large system analysiswas performed for simple precoding schemes, e.g., regular-ized zero-forcing in MISO broadcast channels with imperfectCSI [24]. A random matrix approach to the optimal coor-dinated multicell beamforming for massive MIMO was pre-sented in [43] without close-form expressions for the optimalLagrange multipliers. A simple channel model with two cellsunder different coordination levels was considered in [25].

However, all of the above results cannot be directly appliedin the group sparse beamforming framework in dense Cloud-RAN due to the general channel models with heterogenouspathloss and the complicated beamformer structures with fullycooperative transmission. To determine the RRH orderingcriterion based on statistical CSI, we adopt the techniquein [26] and [27] to compute the closed-forms for the optimalLagrange multipliers for each iteration in the procedure ofreweighted-ℓ2 minimization for sparsity inducing.

Page 3: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

SHI et al.: ENHANCED GROUP SPARSE BEAMFORMING FOR GREEN CLOUD-RAN: A RANDOM MATRIX APPROACH 2513

B. Organization

The remainder of the paper is organized as follows.Section II presents the system model and problem formulation,followed by performance analysis with the proposed three-stage enhanced group sparse beamforming framework forgreen Cloud-RAN. In Section III, the iterative reweighted-ℓ2algorithm for group sparse beamforming is presented via theprinciple of MM algorithm and duality theory. The largesystem analysis for RRH ordering is performed in Section IV.Simulation results will be demonstrated in Section V. Finally,conclusions and discussions are presented in Section VI.To keep the main text clean and free of technical details,we divert most of the proofs to the Appendix.

Notations: Throughout this paper, ∥ · ∥p is the ℓp-norm.| · | stands for either the size of a set or the absolute valueof a scalar, depending on the context. ∥M∥ is the spectralradius of the Hermitian matrix M. Boldface lower case andupper case letters represent vectors and matrices, respectively.(·)−1, (·)T , (·)H and Tr(·) denote the inverse, transpose, Her-mitian and trace operators, respectively. We use C to representcomplex domain. E[·] denotes the expectation of a randomvariable. We denote A = diag{x1, . . . , xN } and IN as adiagonal matrix of order N and the identity matrix of orderN , respectively.

II. SYSTEM MODEL AND PROBLEM FORMULATION

A. System Model

Consider a Cloud-RAN with L RRHs and K single-antennamobile users (MUs), where each RRH is equipped with Nantennas. In Cloud-RANs, the BBU pool will perform the cen-tralized signal processing and is connected to all the RRHs viahigh-capacity and low-latency fronthaul links. In this paper,we will focus on the downlink signal processing. Specifically,let vlk ∈ CN be the transmit beamforming vector from the l-thRRH to the k-th MU. The received signal yk ∈ C at MU k isgiven by

yk =L∑

l=1

hHklvlk sk +

i =k

L∑

l=1

hHkl vli si + nk, (1)

where hkl ∈ CN is the channel propagation between MU k andRRH l, sk ∈ C with E[|sk |2] = 1 is the encoded transmissioninformation symbol for MU k, and nk ∼ CN (0, σ 2

k ) is theadditive Gaussian noise at MU k.

We assume that sk ’s and nk’s are mutually independentand all the MUs apply single-user detection. The signal-to-interference-plus-noise ratio (SINR) for MU k is given by

sinrk = |hHk vk|2∑

i =k |hHk vi |2 + σ 2

k

, (2)

where hk = [hTk1, . . . , hT

kL ]T ∈ CL N is the vector consistingof the channel coefficients from all the RRHs to MU k, andvk = [vT

1k, . . . , vTLk]T ∈ CL N is the aggregative beamforming

vector for the MU k from all the RRHs.

B. Problem Formulation

In this paper, we aim at designing a green Cloud-RAN byminimizing the network power consumption, which consistsof the fronthaul links power consumption, as well as the RRHtransmit power consumption [7]. Specifically, let v = [vlk] ∈CK L N be the aggregative beamforming vector from all theRRHs to all the MUs. Define the support of the beamformingvector v as T (v) = {i |vi = 0}, where v = [vi ] is indexedby i ∈ V with V = {1, . . . , K L N}. The relative fronthaulnetwork power consumption is given by

f1(v) =L∑

l=1

Pcl I (T (v) ∩ Vl = ∅), (3)

where Pcl ≥ 0 is the relative fronthaul link power con-

sumption [7] (i.e., the static power saving when both thefronthaul link and the corresponding RRH are switched off),Vl = {(l − 1)K N + 1, . . . , l K N},∀l ∈ L, is a partition of V ,and I (T ∩ Vl = ∅) is an indicator function that takes value1 if T ∩Vl = ∅ and 0 otherwise. Note that f1 is a non-convexcombinatorial function.

Furthermore, the total transmit power consumption isgiven by

f2(v) =L∑

l=1

K∑

k=1

1ζl

∥vlk∥22, (4)

where ζl > 0 is the drain inefficiency coefficient of the radiofrequency power amplifier [7]. Therefore, the network powerconsumption is represented by the combinatorial compositefunction

f (v) = f1(v) + f2(v). (5)

Given the QoS thresholds γ = (γ1, . . . , γK ) for all theMUs, in this paper, we aim at solving the following networkpower consumption minimization problem with the QoS con-straints:

P : minimizev

f1(v) + f2(v)

subject to|hH

k vk |2∑

i =k |hHk vi |2 + σ 2

k

≥ γk, ∀k. (6)

Let vl = [v]Vl = [vTl1, . . . , vT

lK ]T ∈ CK N form the beamform-ing coefficient group from RRH l to all the MUs. Note that,when the RRH l is switched off, all the beamforming coeffi-cients in vl will be set to zeros simultaneously. Observing thatthere may be multiple RRHs being switched off to minimizethe network power consumption, the optimal beamformingvector v = [vT

1 , . . . , vTL ]T ∈ CK L N should have a group-

sparsity structure. Therefore, problem P is called a groupsparse beamforming problem [7]. Note that, to simplify thepresentation, we only impose the QoS constraints in problemP . However, the proposed instantaneous CSI based iterativereweighted-ℓ2 algorithm can be extended to the scenario withper-RRH transmit power constraints, following the principlesin [20]. More technical efforts are required to derive theasymptotic results using the random matrix method with morecomplicated structures of the optimal beamformers.

Page 4: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

2514 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 17, NO. 4, APRIL 2018

C. Problem Analysis

Although the constraints in problem P can be reformu-lated as convex second-order cone constraints, the non-convexobjective function makes it highly intractable. To addressthis challenge, a weighted mixed ℓ1/ℓ2-norm minimizationapproach was proposed in [7] to convexify the objective andinduce the group sparsity in beamforming vector v, therebyguiding the RRH ordering to enable adaptively RRH selection.Specifically, we will first solve the ℓ1/ℓ2-norm minimizationproblem, and denote v⋆1, . . . , v⋆L as the induced (approximated)group sparse beamforming vectors. Then the following RRHordering criterion is adopted to determine which RRHs shouldbe switched off [7]:

θl = κl∥v⋆l ∥22, ∀l = 1, . . . , L, (7)

where κl = ∑Kk=1 ∥hkl∥2

2/νl . In particular, the group sparsitystructure information for beamforming vector v is extractedfrom the squared ℓ2-norm of the beamforming vectors vl’s,i.e., ∥v1∥2

2, . . . , ∥vL∥22. The RRH with a smaller θl will have a

higher priority to be switched off. Based on the RRH orderingresult θ⋆ in (7), a bi-section search approach can be used tofind the optimal active RRHs [7] via solving a sequence ofthe following feasibility problems:

F (A[i]): find v1, . . . , vK

subject to|hH

k vk |2∑i =k |hH

k vi |2 + σ 2k

≥ γk, ∀k, (8)

where vk = [vlk] ∈ C|A|N and hk = [hkl ] ∈ C|A|N . Prob-lem F (A[i]) turns out to be convex via reformulating the QoSconstraints as second-order cone constraints [7]. To furtherenhance the sparsity as well as to seek the quadratic forms ofthe beamforming vectors in the multicast transmission setting,a smoothed ℓp-minimization approach was proposed in [15].To scale to large problem sizes in dense Cloud-RANs, a two-stage large-scale parallel convex optimization framework wasdeveloped in [16] with the capability of infeasibility detection.

However, all of the above developed algorithms bear highcomputation overhead. In particular, while the feasibility prob-lem (8) for RRH selection can be efficiently solved with thelarge-scale optimization algorithm in [8], the RRH orderingcriterion in (7) may be highly complicated to obtain, espe-cially with the non-convex formulation as in [15]. Moreover,the ordering criterion needs to be recomputed for each trans-mission slot, and depends on instantaneous CSI. Observingthat the statistical CSI normally changes much slower thanthe instantaneous CSI, to reduce the computational burden,we propose to compute the RRH ordering criterion (7) onlybased on statistical CSI, i.e., the long-term channel attenuation.This is achieved by first developing a group sparsity penaltywith quadratic forms in the aggregative beamforming vector v,followed by an iterative reweighted-ℓ2 algorithm with closed-form solutions at each iteration via Lagrangian duality theory,as will be presented in Section III. Then asymptotic analysisis performed to obtain the RRH ordering criterion (7) basedonly on statistical CSI by leveraging the large-dimensionalrandom matrix theory [23], [24], [27], as will be presentedin Section IV.

Fig. 1. The proposed three-stage enhanced group sparse beamformingframework for dense green Cloud-RAN. In the first stage, the RRH orderingcriterion θ is computed only based on the statistical CSI via large systemanalysis. The optimal active RRHs A⋆ in the second stage and the optimalcoordinated beamforming for transmit power minimization in the third stageare computed based on the instantaneous CSI via the large-scale convexoptimization algorithm in [16].

Overall, the proposed three-stage enhanced group sparsebeamforming framework is presented in Fig. 1. Specifically,in the first stage, the RRH ordering criterion θ⋆ is calculatedonly based on statistical CSI using Algorithm 2. In the secondstage, the set of active RRHs A⋆ is obtained based oninstantaneous CSI via solving a sequence of convex feasibilityproblems F (A[i]) using the large-scale convex optimizationframework in [16]. In the third stage, the transmit power isminimized by solving the convex program (6) with the fixedactive RRHs A⋆ using the large-scale convex optimizationalgorithm in [16]. Overall, the proposed enhanced group sparsebeamforming framework is scalable to large network sizes.This paper will focus on developing an effective algorithm forthe first stage.

Remark 1: In this paper, we assume that, in the first stage,problem P (6) is feasible for developing Algorithm 1 andAlgorithm 2 to determine the RRH ordering criterion basedon the instantaneous CSI and statistical CSI, respectively.We enable the capability of handing the infeasibility in the sec-ond stage based on the instantaneous CSI, using the large-scale convex optimization algorithm [16]. Furthermore, whenproblem F (8) is infeasible with all RRHs active, we adoptthe user admission algorithm to find the maximum number ofadmitted users [15].

III. GROUP SPARSE BEAMFORMING VIA ITERATIVE

REWEIGHTED-ℓ2 ALGORITHM

In this section, we develop a group sparse beamformingapproach based on the smoothed ℓp-minimization, supportedby an iterative reweighted-ℓ2 algorithm, thereby enhancingthe group sparsity in the beamforming vectors. Instead ofreformulating the QoS constraints in problem P as second-order cone constraints [7], we use the Lagrangian dualitytheory to reveal the structures of the optimal solutions ateach iteration. The results will assist the large system analysisin Section IV.

A. Group Sparsity Inducing Optimization viaSmoothed ℓp-Minimization

To induce the group sparsity structure in the beamformingvector v, thereby guiding the RRH ordering, the weightedmixed ℓ1/ℓ2-norm was proposed in [7]. However, the non-smooth mixed ℓ1/ℓ2-norm fails to introduce the quadraticforms in the beamforming vector v, in order to be compatiblewith the quadratic QoS constraints so that the Lagrangian

Page 5: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

SHI et al.: ENHANCED GROUP SPARSE BEAMFORMING FOR GREEN CLOUD-RAN: A RANDOM MATRIX APPROACH 2515

duality theory can be applied [20]. To address this chal-lenge, we adopt the following smoothed ℓp-minimization(0 < p ≤ 1) approach to induce group sparsity [14], [15]:

PGSBF: minimizev

gp(v; ϵ) :=L∑

l=1

νl

(∥vl∥2

2 + ϵ2)p/2

subject to|hH

k vk|2∑i =k |hH

k vi |2 + σ 2k

≥ γk, ∀k, (9)

where ϵ > 0 is some fixed regularizing parameter and νl > 0is the weight for the beamforming coefficient group vl byencoding the prior information of system parameters [7].Compared with the mixed ℓ1/ℓ2-norm minimization approach,the ℓp-minimization approach can induce sparser solutionsbased on the fact ∥z∥0 = lim p→0 ∥z∥p

p = lim p→0∑

i |zi |p.Unfortunately, problem (9) is non-convex due to the non-convexity of both the objective function and the QoSconstraints.

B. Iterative Reweighted-ℓ2 Algorithm

We use the MM algorithm and the Lagrangian duality theoryto solve problem (9) with closed-form solutions. Specifically,this approach generates the iterates {v[n]} by successivelyminimizing upper bounds Q(v; v[n]) of the objective functiongp(v; ϵ). We adopt the following upper bounds to approximatethe smoothed ℓp-norm g(v; ϵ) by leveraging the results of theexpectation-maximization (EM) algorithm [44].

Proposition 1: Given the value of v[n] at the n-th iteration,an upper bound for the objective function gp(v; ϵ) can beconstructed as follows:

Q(v;ω[n]) :=L∑

l=1

ω[n]l ∥vl∥2

2, (10)

where

ω[n]l = pνl

2

[∥∥∥v[n]l

∥∥∥2

2+ ϵ2

] p2 −1

, ∀l = 1, . . . , L . (11)

Proof: The proof is mostly based on [15, Proposition 1].

Therefore, at the n-th iteration, we need to solve thefollowing optimization problem:

P [n]GSBF: minimize

v

L∑

l=1

ω[n]l ∥vl∥2

2

subject to|hH

k vk|2∑

i =k |hHk vi |2 + σ 2

k

≥ γk, ∀k, (12)

which is non-convex due to the non-convex QoS constraints.Although the QoS constraints can be reformulated as second-order cone constraints as in [7], in this paper, we leveragethe Lagrangian duality theory to obtain explicit structures ofthe optimal solution to problem P [n]

GSBF, thereby reducingthe computational complexity and further aiding the largesystem analysis in next section. The success of applying theLagrangian duality approach is based on the fact that strongduality holds for problem P [n]

GSBF [45], i.e., the gap betweenthe primal optimal objective and the dual optimal objective iszero.

1) Simple Solution Structures: As the strong duality holdsfor problem P [n]

GSBF [45], we solve it using duality theory.Specifically, let λk/(N L) ≥ 0 denote the Lagrange multiplierscorresponding to the QoS constraints in problem P [n]

GSBF.Define the Lagrangian function L({vk},λ) with λ = {λi } as

L({vk},λ) =K∑

k=1

λk

L N

⎝∑

i =k

|hHk vi |2 + σ 2

k − 1γk

|hHk vk |2

+K∑

k=1

vHk Q[n]vk, (13)

where Q[n] ∈ CL N×L N is a block diagonal matrix with thescaled identity matrix ω[n]

l IN as the l-th main diagonal blocksquare matrix.

To find the optimal vk’s, we take the gradient of theLagrangian function L(v,λ) with respect to vk and set it tozero, which implies

Q[n]vk +∑

i =k

λk

L Nhi hH

i vk − λk

L NγkhkhH

k vk = 0. (14)

By adding λk/(L N) × hkhHk vk to both sides of (14), we have

(

Q[n] +K∑

i=1

λi

L Nhi hH

i

)

vk = λk

L N

(1 + 1

γk

)hkhH

k vk, (15)

which implies vk =(

Q[n] + ∑Ki=1 λi/(L N )hi hH

i

)−1hk ×

(λk/(L N )) (1 + 1/γk) hHk vk . As (λk/L N ) (1 + 1/γk) hH

k vk isa scalar, the optimal vk must be parallel to the beamformingdirection

vk =(

Q[n] +K∑

i=1

λi

L Nhi hH

i

)−1

hk, ∀k. (16)

Therefore, the optimal beamforming vectors v1, . . . , vK canbe written as

vk =√

pk

L Nvk

∥vk∥2, ∀k, (17)

where pk denotes the optimal beamforming power. Observingthat the beamforming powers pk’s need to satisfy the SINRconstraints with equality [20], i.e.,

pk

γk L N

|hHk vk |2

∥vk∥22

−∑

i =k

pi

L N

|hHk vi |2

∥vi∥22

= σ 2k , ∀k, (18)

the optimal powers thus can be obtained via solving thefollowing linear equation:

⎢⎣p1...

pK

⎥⎦ = M−1

⎢⎣σ 2

1...σ 2

K

⎥⎦, (19)

where

[M]i j =

⎧⎪⎪⎪⎨

⎪⎪⎪⎩

1γi L N

|hHi vi |2

∥vi∥22

, i = j,

− 1L N

|hHi v j |2

∥v j∥22

, i = j.

Page 6: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

2516 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 17, NO. 4, APRIL 2018

Here, [M]i j denotes the (i, j)-th element of the matrixM ∈ RK×K .

To find the optimal λ-parameter, by multiplying both sides

by hHk

(Q[n] + ∑K

i=1λi

L N hi hHi

)−1in (15), we obtain hH

k vk as

λk

L N

(1 + 1

γk

)hH

k

(

Q[n] +K∑

i=1

λi

L Nhi hH

i

)−1

hkhHk vk, (20)

which implies

λk = L N

⎣(

1 + 1γk

)hH

k

(

Q[n] +K∑

i=1

λi

L Nhi hH

i

)−1

hk

⎦−1

.

(21)

These fixed-point equations can be computed using iterativefunction evaluation.

Based on (17), (19) and (21), we obtain the squaredℓ2-norm of the optimal solution to problem P [n]

GSBF asfollows:

∥∥∥v[n]l

∥∥∥2

2=

K∑

k=1

vHk Qlkvk, (22)

where Qlk ∈ CN L×N L is a block diagonal matrix with theidentity matrix IN as the l-th main diagonal block squarematrix and zeros elsewhere. Note that the optimal powers pk’s(19) and Lagrangian multipliers λk ’s (21) should depend onthe weights ω[n]

l ’s (11) at each iteration n.The instantaneous CSI based iterative reweighted-ℓ2

algorithm for group sparse beamforming is presentedin Algorithm 1. We have the following result for its perfor-mance and convergence.

Algorithm 1 Instantaneous CSI Based Iterative Reweighted-ℓ2Algorithm for Problem PGSBF

input: Initialize ω[0] = (1, . . . , 1); I (the maximum numberof iterations)Repeat

1) Compute the squared ℓ2-norm of the solution to problemP [n]

GSBF, ∥v[n]1 ∥2

2, . . . , ∥v[n]L ∥2

2, using (17), (19), (21) and (22).2) Update the weights ω[n+1]

l using (11).Until convergence or attain the maximum iterations and returnoutput.output: RRH ordering criterion θ (7).

Theorem 1: Let {v[n]}∞n=1 be the sequence generated by theiterative reweighted-ℓ2 algorithm. Then, every limit point v of{v[n]}∞n=1 has the following properties:

1) v is a KKT point of problem PGSBF (9);2) gp(v[n]; ϵ) converges monotonically to gp(v⋆; ϵ) for

some KKT point v⋆.Proof: Please refer to Appendix A for details.

Compared with the algorithm for the smoothed-ℓp mini-mization in [15], the main novelty of the proposed iterativereweighted-ℓ2 algorithm is revealing the explicit structures ofthe solutions at each iteration in Algorithm 1. Instead of using

the interior-point algorithm to solve the convex subproblemsat each iteration [46], Algorithm 1 with closed-form solutionshelps reduce the computational cost. The proposed iterativereweighted-ℓ2 algorithm can only guarantee to converge toa KKT point, which may be a local minimum or the otherstationary point (e.g., a saddle point and local maximum).

Remark 2: The main contributions of the developedAlgorithm 1 include finding the closed-form solutions in theiterations, in comparison with the work [15] using the interior-point algorithm, and proving the convergence of the low-complexity closed-form iterative algorithm for the non-convexgroup sparse inducing problem PGSBF (9), instead of theconvex coordinated beamforming problem [45].

Unfortunately, computing the RRH ordering criterion θl ’sin (7) requires to run Algorithm 1 for each channel realization,which brings a heavy computation burden. In next section, wewill find deterministic approximations for θl’s to determine theRRH ordering only based on statistical CSI, which changesmuch more slowly than instantaneous channel states, and thusrequires less frequent update.

IV. GROUP SPARSE BEAMFORMING VIA

LARGE SYSTEM ANALYSIS

In this section, we present the large system analysis forthe iterative reweighted-ℓ2 algorithm in Algorithm 1, therebyenabling RRH ordering only based on statistical CSI. In thisway, the ordering criterion will change only when the long-term channel attenuation is updated, and thus it can fur-ther reduce the computational complexity of group sparsebeamforming. The main novelty of this section is providingthe closed-forms for the asymptotic analysis of the optimalLagrangian multipliers based on the recent results in [27],thereby providing explicit expressions for the asymptotic RRHordering results.

A. Deterministic Equivalent of Optimal Parameters

In this subsection, we provide asymptotic analysis for theoptimal beamforming parameters pk’s (19) and λk ’s (21) whenN → ∞. In a Cloud-RAN with distributed RRHs, the channelcan be modeled as hk = &

1/2k gk,∀k, where gk ∼ CN (0, IN L )

is the small-scale fading and ,k = diag{dk1, . . . , dkL} ⊗ INwith dkl as the path-loss from RRH l to MU k. With this chan-nel model, we have the following result for the deterministicequivalent of the λ-parameter.

Lemma 1 (Asymptotic Results for λ-Parameter): Assume

0 < lim inf N→∞ K/N ≤ lim supN→∞

K/N < ∞.

Let {dkl} and {γk} satisfy lim supN maxk,l {dkl} < ∞and lim supN maxk γk < ∞, respectively. We have

max1≤k≤K |λk − λ◦k |

N→∞−→ 0 almost surely, where λ◦k is

given by

λ◦k = γk

(1L

L∑

l=1

dklηl

)−1

. (23)

Page 7: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

SHI et al.: ENHANCED GROUP SPARSE BEAMFORMING FOR GREEN CLOUD-RAN: A RANDOM MATRIX APPROACH 2517

Here, ηl is the unique solution of the following set ofequations:

ηl =(

1N L

K∑

i=1

dil1L

∑Lj=1 di jη j

γi

1 + γi+ ω[n]

l

)−1

. (24)

Proof: Please refer to Appendix B for details.Based on the above results, we further have the following

asymptotic result for the optimal powers pk’s.Lemma 2 (Asymptotic Results for Optimal Powers): Let

. ∈ CK×K be such that

[.]k,i := 1N L

γi

(1 + γi )2

ψ ′ik

ψ◦2i

. (25)

If and only if lim supK ∥.∥2 < 1, then maxk |pk − p◦k |

N→∞−→ 0almost surely, where p◦

k is given by

p◦k = γk

ψ ′k

ψ◦2k

(τk

(1 + γk)2 + σ 2k

). (26)

Here ψ◦k , ψ ′

k and ψ ′ik are given as follows:

ψ◦k = 1

L

L∑

l=1

dklηl , (27)

and

ψ ′k = 1

L

L∑

l=1

dklη2l + 1

N L

K∑

j=1

λ◦2j ψ

′j

(1 + γ j )2

1L

L∑

l=1

dild j lη2l , (28)

and

ψ ′ik = 1

L

L∑

l=1

dil dklη2l +

1N L

K∑

j=1

λ◦2j ψ

′j k

(1+γ j)2

1L

L∑

l=1

dild j lη2l , (29)

respectively; and τ = [τ1, . . . , τK ]T is given as

τ = σ 2 (IK −.)−1 δ, (30)

where δ = [δ1, . . . , δK ]T with

δk = 1N L

K∑

i=1

γiψ ′

ik

ψ◦2i

. (31)

Proof: Please refer to Appendix C for details.

B. Statistical CSI Based Group SparseBeamforming Algorithm

Based on the deterministic equivalents of the optimalbeamforming parameters λk’s and pk’s in Lemma 1 andLemma 2, respectively, we have the following theorem onthe squared ℓ2-norm of the solution to problem P [n]

GSBF, i.e.,∥v[n]

1 ∥22, . . . , ∥v[n]

L ∥22.

Theorem 2 (Asymptotic Results for the Beamformers):At the n-th iteration, for the squared ℓ2-norm of thesolution to problem P [n]

GSBF, ∥v[n]1 ∥2

2, . . . , ∥v[n]L ∥2

2, we have

maxl

∣∣∣∥v[n]l ∥2

2 − χl

∣∣∣ N→∞−→ 0 almost surely with

χl = 1N L

K∑

k=1

p◦kψkl

ψ ′k

, (32)

where ψkl = 1N L dklη2

l + 1N L

∑Kj=1

λ◦2j ψ jl

(1+γ j )21L

∑Ll=1 dil

d j lη2l .Proof: Please refer to Appendix D for details.

We thus have the following deterministic equivalent of theweights ω[n]

l ’s (11):

ω[n]l = pνl

2

[χl + ϵ2

] p2 −1

, ∀l = 1, . . . , L . (33)

Note that the asymptotic results of the powers p◦k ’s (26)

and Lagrangian multipliers λ◦k ’s (23) should depend on the

weights ω[n]l ’s (33) at each iteration. Based on (7), we have

the following asymptotic result for the RRH ordering criterion:

θl = κl χ⋆l , (34)

where χ ⋆l is the deterministic equivalent of the squaredℓ2-norm of the solution to problem PGSBF using the iterativereweighted-ℓ2 algorithm. Therefore, the RRH with a smallerθl will have a higher priority to be switched off. This order-ing criterion will change only when the long-term channelattenuation is updated. Note that global instantaneous CSI isstill needed in stage II to find the active RRHs, i.e., solving asequence of convex feasibility problems F (A[i]) (8).

The statistical CSI based iterative reweighted-ℓ2 algorithmfor group sparse beamforming is presented in Algorithm 2.

Algorithm 2 Statistical CSI Based Iterative Reweighted-ℓ2Algorithm for Problem PGSBF

input: Initialize ω[0] = (1, . . . , 1); I (the maximum numberof iterations)Repeat

1) Compute deterministic equivalent of the squared ℓ2-normof the solution to P [n]

GSBF, ∥v[n]1 ∥2

2, . . . , ∥v[n]L ∥2

2, using χ [n]l

(32).2) Update the weights ω[n+1]

l using (33).Until convergence or attain the maximum iterations and returnoutput.output: Asymptotic result of the RRH ordering criterion θ(34).

Remark 3: Although stage 2 and stage 3 still requireinstantaneous CSI, the RRH ordering criterion computed byAlgorithm 2 in stage 1 is only based on statistical CSI and thuswill be updated only when the long-term channel propagationis changed. Therefore, Algorithm 2 serves the purpose offurther reducing the computation complexity of Algorithm 1for RRH ordering. A promising future research direction is toselect RRHs in stage 2 only based on statistical CSI. However,the main challenge is the infeasibility issue if only statisticalCSI is available, as discussed in Remark 1.

V. SIMULATION RESULTS

In this section, we will simulate the proposed algorithmsfor network power minimization for Cloud-RANs. In all therealizations, we only account the channel realizations makingthe original problem P feasible. If problem P is infeasible,further works on user admission are required [15]. We set

Page 8: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

2518 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 17, NO. 4, APRIL 2018

Fig. 2. Convergence of the iterative reweighted-ℓ2 algorithm.

p = 1, ϵ = 10−3 and ω[0] = (1, . . . , 1) for all thesimulations. The proposed iterative reweighted-ℓ2 algorithms(Algorithm 1 and Algorithm 2) will terminate if either thenumber of iterations exceeds 30 or the difference between theobjective values of consecutive iterations is less than 10−3.

A. Convergence of the Iteratively Reweighted-ℓ2 Algorithm

Consider a network with L = 5 30-antenna RRHs and5 single antenna MUs uniformly and independently distributedin the square region [−2000, 2000] × [−2000, 2000] meters.Fig. 2 shows the convergence of the iterative reweighted-ℓ2algorithm in different scenarios with different realizations ofRRH and MU positions. For each scenario, the numericalresults are obtained by averaging over 100 small-scale fadingrealizations. The large-scale fading (i.e., statistical CSI) israndomly generated and fixed during simulations. In the twocurves of each scenario, Algorithm 1 and Algorithm 2 areapplied for Stage I of group sparse beamforming, respec-tively. It demonstrates that the large system analysis basedAlgorithm 2 provides accurate approximation even in a smallsystem. Fast convergence is observed in the simulated setting.

B. Network Power Minimization

Consider a network with L = 5 10-antenna RRHs and6 single antenna MUs uniformly and independently distributedin the square region [−1000, 1000] × [−1000, 1000] meters.The relative transport link power consumption are set to bePc

l = (5.6 + 2l)W, l = 1, . . . , L. We average over 200 small-scale channel realizations. Fig. 3 demonstrates the averagenetwork power consumption using different algorithms. Thisfigure shows that the proposed iterative reweighted-ℓ2 algo-rithm outperforms the weighted mixed ℓ1/ℓ2-norm algorithmin [7]. Furthermore, it illustrates that the large system analysisprovides accurate approximations for the iterative reweighted-ℓ2 algorithm in finite systems with reduced computationoverhead. In Table I, for the simulated scenarios, we seethat the iterative reweighted-ℓ2 algorithms achieve similarperformance with different values of parameter p and the large

Fig. 3. Average network power consumption versus target SINR withdifferent algorithms.

TABLE I

NETWORK POWER CONSUMPTION USING ALGORITHM 1AND ALGORITHM 2 WITH DIFFERENT PARAMETERS p

system analysis provides accurate approximations. Althoughthe simulated results demonstrate that the performance isrobust to different values of parameter p, it is very interestingto theoretically identify the typical scenarios, where smallervalues of parameter p will yield much lower network powerconsumption.

VI. CONCLUSIONS AND DISCUSSIONS

In this paper, we developed an enhanced three-stage groupsparse beamforming framework with reduced computationoverhead in Cloud-RANs. In particular, closed-form solutionsfor the group sparse optimization problem were obtained bydeveloping the iterative reweighted-ℓ2 algorithm based on theMM algorithm and Lagrangian duality theory. This is the firsteffort to reduce the computation cost for RRH ordering inthe first stage of group sparse beamforming. Based on thedeveloped structured iterative algorithm, we further provideda large system analysis for the optimal Lagrangian multipliersat each iteration via random matrix theory, thereby computingthe RRH ordering criterion only based on the statistical CSI.This is the second effort to enable computation scalability forRRH ordering in the first stage.

Several future directions of interest are listed as follows:• Although computing the RRH ordering criterion in the

first stage with Algorithm 2 only needs statistical CSI,the second stage still requires global instantaneous CSI

Page 9: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

SHI et al.: ENHANCED GROUP SPARSE BEAMFORMING FOR GREEN CLOUD-RAN: A RANDOM MATRIX APPROACH 2519

to find the active RRHs by solving a sequence of convexfeasibility problems. It is thus particularly interesting toinvestigate efficient algorithms to select the active RRHsonly based on the statistical CSI.

• It is desirable to establish the optimality of the iterativereweighted-ℓ2 algorithm for the network power mini-mization problem P . However, considering the compli-cated problem structures of problem P , this becomeschallenging. It is also interesting to apply the developedstatistical CSI based iterative reweighted-ℓ2 algorithm formore complicated network optimization problems, e.g.,the wireless caching problem [33], [34], the computationoffloading problem [36], and beamforming problems withCSI uncertainty [32].

APPENDIX APROOF OF THEOREM 1

1) Observing that the phases of vk will not changethe objective function and constraints of problemPGSBF, the QoS constraints thus can be equivalentlytransformed to the second-order cone constraints:

C ={

v :√∑

i =k |hHk vi |2 + σ 2

k ≤ R(hHk vk)/γk,∀k

}, which

are convex cones. The KKT points of problem PGSBF shouldsatisfy:

0 ∈ ∇vgp(v; ϵ) + NC(v), (35)

where NC(v) is the normal cone of the second-order cone atpoint v [47]. We shall show that any convergent subsequence{v[nk ]}∞k=1 of {v[n]}∞n=1 satisfies (35). Specifically, let v[nk ] → vbe one such convergent subsequence with

limk→∞

v[nk+1] = limk→∞

v[nk ] = v. (36)

Based on the strong duality result for problem P [n]GSBF, the fol-

lowing KKT condition holds at v[nk+1]:

0 ∈ ∇v Q(v[nk+1];ω[nk ]) + NC(v[nk+1]). (37)

Based on (36) and (11), we havelimk→∞ ∇v Q(v[nk+1];ω[nk ]) = ∇vgp(v; ϵ). Furthermore,based on [47, Proposition 6.6] and (36), we havelim supv[nk+1]→v NC(v[nk+1]) = NC(v). Therefore, by takingk → ∞ in (37), we have 0 ∈ ∇v Q(v; ω) + NC(v), whichindicates that v is a KKT point of problem PGSBF. We thuscomplete the proof.

2) We first conclude the following fact:

gp(v[n+1]; ϵ)= Q(v[n+1];ω[n]) + gp(v[n+1]; ϵ) − Q(v[n+1];ω[n])

≤ Q(v[n+1];ω[n]) + gp(v[n]; ϵ) − Q(v[n];ω[n])≤ Q(v[n];ω[n]) + gp(v[n]; ϵ) − Q(v[n];ω[n])

= gp(v[n]; ϵ), (38)

where the first inequality is based on the fact that function(gp(v; ϵ) − Q(v;ω[n])) attains its maximum at v = v[n]

[15, Proposition 1], and the second inequality followsfrom (37). Furthermore, as gp(v; ϵ) is continuous and C iscompact, the limit of the sequence gp(v[n]; ϵ) is finite. Basedon the convergence results in 1), we thus complete the proof.

APPENDIX BPROOF OF LEMMA 1

The proof technique here is mainly based on the work [26].However, we have to modify their proof as we have differentweighting matrices Q[n] in vectors vk (16) at the n-th iterationand such modification is non-trivial. Based on the followingmatrix inversion lemma [48], xH(U + cxxH)−1 = xHU−1

1+cxHU−1x,

(21) can be rewritten as

γk

λk= 1

L NhH

k

⎝ 1L N

i =k

λi hi hHi + Q

⎠−1

hk, (39)

which can be further rewritten as

γkρk = λ◦k

L NhH

k

⎝ 1L N

i =k

λ◦i

ρihi hH

i + Q

⎠−1

hk, (40)

where ρk = λ◦k/λk .

Assume that 0 ≤ ρ1 ≤ ρ2 ≤ · · · ≤ ρK . From (40), replacingρi with ρK and using monotonicity arguments, we have

γKρK ≤ λ◦K

L NhH

K

⎝ 1L N

i =K

λ◦i

ρKhi hH

i + Q

⎠−1

hK , (41)

or, equivalently

γK ≤ λ◦K

L NhH

K

⎝ 1L N

i =K

λ◦i hi hH

i + ρK Q

⎠−1

hK . (42)

Assume now that ρK is infinitely often larger than (1+ℓ) withℓ being some positive value [27]. Let us restrict ourselves tosuch a subsequence. From (42), using monotonicity argumentswe obtain

γK ≤ λ◦K

L NhH

K

⎝ 1L N

i =K

λ◦i hi hH

i + (1 + ℓ)Q

⎠−1

hK . (43)

Define

mk(ℓ)=1

L NhH

k

⎝ 1L N

i =k

λ◦i hi hH

i + (1 + ℓ)Q

⎠−1

hk . (44)

Now we investigate the deterministic equivalents for mk(ℓ).We first introduce the following important lemma:

Lemma 3 ([23, Lemma 14.2]): Let A1, A2, . . . , with AN ∈CN×N be a series of random matrices generated by theprobability space (4,F , P) such that, for ω ∈ A ⊂ 4, withP(A) = 1, ∥AN (ω)∥ < K (ω) < ∞, uniformly on N . Letx1, x2, . . . , with xN ∈ CN , be random vectors of i.i.d. entrieswith zero mean, variance 1/N , and the eighth-order momentof order O(1/N4), independent of AN . Then

xHN AN xN − 1

NTr(AN )

N→∞−→ 0, (45)

almost surely.

Page 10: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

2520 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 17, NO. 4, APRIL 2018

Therefore, based on Lemma 3, we have

mK (ℓ)− 1L N

Tr,K

⎝ 1L N

i =K

λ◦i hi hH

i +(1+ℓ)Q

⎠−1

N→∞−→ 0.

(46)

We further introduce the following important lemma toderive the deterministic equivalent for the mk(l).

Lemma 4 ([24, Th. 1]): Let BN = XHN XN + SN with SN ∈

CN×N Hermitian nonnegative definite and XN ∈ Cn×N

random. The i -th column xi of XHN is xi = 5i yi , where

the entries of yi ∈ Cri are i.i.d. of zero mean, variance1/N and have eighth-order moment of order O(1/N4). Thematrices 5i ∈ CN×ri are deterministic. Furthermore, let,i = 5i5H

i ∈ CN×N and define QN ∈ CN×N deterministic.Assume lim supN→∞ sup1≤i≤n ∥,i∥ < ∞ and let QN haveuniformly bounded spectral norm (with respect to N). Define

mBN ,QN (z) := 1N

TrQN (BN − zIN )−1. (47)

Then, for z ∈ C\R+, as n, N grow large with ratios βN,i :=N/ri and βN := N/n such that 0 < lim inf N βN ≤lim supN βN < ∞ and 0 < lim inf N βN,i ≤ lim supN βN,i <∞, we have that

mBN ,QN (z) − m◦BN ,QN

(z)N→∞−→ 0 (48)

almost surely, with m◦BN ,QN

(z) given by

m◦BN ,QN

(z)= 1N

TrQN

⎝ 1N

n∑

j=1

, j

1+eN, j (z)+SN −zIN

⎠−1

,

(49)

where the functions eN,1(z), . . . , eN,n(z) from the uniquesolution of

eN,i (z)=1N

Tr,i

⎝ 1N

n∑

j=1

, j

1+eN, j (z)+SN −zIN

⎠−1

, (50)

which is the Stieltjes transform of a nonnegative finite measureon R+. Moreover, for z < 0, the scalars eN,1(z), . . . , eN,n(z)are unique nonnegative solutions to (50).

Therefore, based on Lemma 4, we have

ψK (ℓ) − ψ◦K (ℓ)

N→∞−→ 0 (51)

almost surely with ψ◦K (ℓ) being the unique positive solution

to

ψ◦K (ℓ) = 1

N LTr(,K A(ℓ)) (52)

where

A(ℓ) =(

1N L

K∑

i=1

λ◦i,i

1 + λ◦i ψ

◦i (ℓ)

+ (1 + ℓ)Q

)−1

. (53)

From (51), recalling (43) yields

limK→∞

inf ψ◦K (ℓ) ≥ γK /λ◦

K (54)

Using the fact that ψ◦K (ℓ) is a decreasing function of ℓ, it can

be proved [23] that for any ℓ > 0, we have

limK→∞

supψ◦K (ℓ) < γK /λ◦

K . (55)

However, this is against the former condition and creates acontradiction on the initial hypothesis that ρK < 1+ℓ infinitelyoften. Therefore, we must admit that ρK ≤ 1 + ℓ for alllarge values of K . Reverting all inequalities and using similararguments yields ρK ≥ 1 − ℓ for all large values of K .

We eventually obtain that 1 − ℓ ≤ ρK ≤ 1 + ℓ fromwhich we may write max |ρK − 1| ≤ ℓ for all large valuesof K . Taking a countable sequence of ℓ going to zero yieldsmax |ρK − 1| → 0 from which using ρK = λ◦

K /λK andassuming limK→∞ sup γK < ∞, we obtain

max |λK − λ◦K | N→∞−→ 0, (56)

almost surely. From (40), we have γK /λ◦K = ψ◦

K (0) = ψ◦K .

Therefore, we obtain λ◦K = γK /ψ◦

K , where ψ◦K is the solution

of the following equation:

ψ◦K = 1

N LTr,K

(1

N L

K∑

i=1

,i

ψ◦K

γi

1 + γ i+ Q

)−1

. (57)

Following the same steps for k = 1, . . . , K yields thefollowing desired result:

λ◦k = γk/ψ

◦k , ∀k. (58)

As ,k = diag{dk1, . . . , dkL} ⊗ IN , we have

ψ◦k = 1

L NTr(,kA) = 1

L

L∑

l=1

dklηl , (59)

where A = A(0) in (53) and {ηl} is the unique positive solutionto the following set of equations

ηl =(

1N L

K∑

i=1

dil

1/L∑L

j=1 dkjη j

γi

1 + γi+ ω[n]

l

)−1

. (60)

We thus complete the proof for the asymptotic results of theoptimal Lagrange multipliers in Lemma 1.

APPENDIX CPROOF OF LEMMA 2

The SINR for MU k (2) can be rewritten as

sinrk =pk

N L|hH

k vk |2∥vk∥2

2∑

i =kpi

N L|hH

k vi |2∥vi∥2

2+ σ 2

k

, (61)

where vk =(∑K

i=1λ◦

iL N hi hH

i + Q1/2)−1

hk . The interferencepart of the denominator of 7k in (61) can be rewritten as

1N L

i =k

pi|hH

k vi |2∥vi∥2

2= 1

N L

i =k

pi hHk

(vi vH

i

∥vi∥22

)

hk

= 1N L

hHk V

(1

N LH[k]P[k]H[k]H

)Vhk

(62)

Page 11: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

SHI et al.: ENHANCED GROUP SPARSE BEAMFORMING FOR GREEN CLOUD-RAN: A RANDOM MATRIX APPROACH 2521

with V =(

1N L

∑Ki=1 λ

◦i hi hH

i + Q)−1

, H[k] :=[h1, . . . , hk−1, hk+1, . . . , hK ] ∈ CN L×(K−1) and P[k] :=diag

{p1

1N L ∥v1∥2

2, . . . , pk−1

1N L ∥vk−1∥2

2, pk+1

1N L ∥vk+1∥2

2, . . . , pK

1N L ∥vK ∥2

2

}.

In order to eliminate the dependence between hk and V,rewrite (62) as

1N L

hHk V

(1

N LH[k]P[k]H[k]H

)Vhk

= 1N L

hHk V[k]

(1

N LH[k]P[k]H[k]H

)Vhk

+ 1N L

hHk

(V − V[k]

)(1

N LH[k]P[k]H[k]H

)Vhk, (63)

where V[k] =(

1N L

∑i =k λ

◦i hi hH

i + Q)−1

. Using the resolvent

identity [48] (i.e., U−1 − V−1 = −U−1(U − V)V−1 with Uand V as two invertible complex matrices of size N × N),we have

V − V[k] = −V(V−1 − V[k]−1)V[k]. (64)

Then, observing that V−1 − V[k]−1 = λ◦k

N L hkhHk . From (63),

one gets

1N L

hHk V

(1

N LH[k]P[k]H[k]H

)Vhk

= 1N L

hHk V[k]

(1

N LH[k]P[k]H[k]H

)Vhk

− λk

N LhH

k Vhk

[1

N LhH

k V[k](

1N L

H[k]P[k]H[k]H)

Vhk

].

(65)

Lemma 5 ([24, Lemma 7]): Let U, V,, ∈ CN×N be ofuniformly bounded spectral norm with respect to N andlet V be invertible. Further, define x := ,1/2z and y :=,1/2q, where z, q ∈ CN have i.i.d. complex entries of zeromean, variance 1/N , and finite eighth-order moment and bemutually independent as well as independent of U, V. Definec0, c1, c2 ∈ R+ such that c0c1 − c2

2 ≥ 0 and let µ :=1N Tr,V−1 and µ′ := 1

N Tr(,UV−1). Then, we have

xHU(V + c0xxH + c1yyH + c2xyH + c2yxH)−1x

− µ′(1 + c1µ)

(c0c1 − c22)µ

2 + (c0 + c1)µ + 1N→∞−→ 0 (66)

almost surely. Furthermore,

xHU(V + c0xxH + c1yyH + c2xyH + c2yxH)−1y

− c2µµ′

(c0c1 − c22)µ

2 + (c0 + c1)µ + 1N→∞−→ 0 (67)

almost surely.Therefore, applying Lemma 5, we obtain that

1N L

hHk V[k]

(1

N LH[k]P[k]H[k]H

)Vhk − µ′

1 + λ◦kµ

N→∞−→ 0

(68)

almost surely, where µ = 1N L Tr

(,kV[k]) and µ′ =

1N L Tr

( 1N L P[k]H[k]HV[k],kV[k]H[k]). Furthermore, we have

λ◦k

N LhH

k Vhk

[1

N LhH

k V[k](

1N L

H[k]P[k]H[k]H)

Vhk

]

− λ◦kµµ′

(1 + λ◦kµ)2

N→∞−→ 0 (69)

almost surely. Based on the results in (65), (68), (69), we have

1N L

hHk V

(1

N LH[k]P[k]H[k]H

)Vhk − µ′

(1 + λ◦kµ)2

N→∞−→ 0

(70)

almost surely.From Lemma 1, we have µ − ψ◦

kN→∞−→ 0 almost surely.

Applying [24, Lemma 6], we have

1N L

Tr(

1N L

P[k]H[k]HV[k],kV[k]H[k])

− 1N L

Tr(

1N L

P[k]H[k]HV,kVH[k])

N→∞−→ 0 (71)

almost surely. We further rewrite

1N L

Tr(

1N L

P[k]H[k]HV,kVH[k])

= 1N L

i =k

pi

1N L hH

i V,kVhi1

N L hHi V2hi

. (72)

Applying [24, Lemma 1, 4 and 6], we obtain almost surely

1N L

hHi V,kVhi −

1N L Tr(,iV,kV)

[1 + 1

N L λ◦i Tr(,i V)

]2 → 0. (73)

To derive a deterministic equivalent for 1N L tr (,i V,kV), we

write1

N LTr (,i V,kV) = 1

N L∂

∂zTr

(,i

(V−1 − z,k

)−1)∣∣∣∣

z=0.

(74)

Observe now that [24, Theorem 1]

Tr(,i

(V−1 − z,k

)−1)

− ψik(z) → 0, (75)

almost surely, where ψik (z) is given by ψik (z) =1

N L Tr(,i Tk(z)) and Tk(z) is computed as

Tk(z) =

⎝ 1N L

K∑

j=1

λ◦j, j

1 + λ◦jψ j k(z)

+ Q − z,k

⎠−1

. (76)

By differentiating along z, we have ψ ′ik(z) = 1

N L Tr(,i T′k(z)),

where T′k(z) = ∂Tk(z)

∂z is given by

T′k(z) = Tk(z)

⎝ 1N L

K∑

j=1

λ◦2j ψ

′j k(z), j

(1 + λ◦

kψ j k(z))2 +,k

⎠ Tk(z).

(77)

Setting z = 0 yields

ψ ′ik (0) = 1

N LTr(,i T′

k(0)), (78)

Page 12: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

2522 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 17, NO. 4, APRIL 2018

where

T′k(0) = Tk(0)

⎝ 1N L

K∑

j=1

λ◦2j ψ

′j k(0),k

(1 + γk)2 +,k

⎠ Tk(0). (79)

In writing the above result, we have taken into account thatA = T j (0), ψ j k(0) = 1

N L Tr(, j Tk(0)) = 1N L Tr(, j A) = ψ◦

jand γ j = λ◦

jψ◦j . Plugging (79) into (78) and neglecting the

functional dependence from z = 0, ψ ′k = [ψ ′

1k . . . ,ψ ′K k]T is

found as the unique solution of

ψ ′ik = 1

N LTr

⎝,i A

⎝ 1N L

K∑

j=1

λ◦2j ψ

′j k, j

(1 + γ j )2 +,k

⎠ A

= 1N L

Tr(,i A,kA)

+ 1N L

K∑

j=1

λ◦2j ψ

′j k

(1 + γ j )2

1N L

Tr(,i A, j A

). (80)

Observing that

1N L

Tr(,i A, j A) = 1L

L∑

l=1

dild j lη2l . (81)

Let ψ ′k = [ψ ′

1k, . . . ,ψ′K k ]T ∈ CK and J = [Ji j ] ∈ CK×K

with

Ji j =λ◦2

j

N L(1 + γ j )2

(1L

L∑

l=1

dil d j lη2l

)

. (82)

We can rewrite the above system of equations in the compactform as

ψ ′k = ck + Jψ ′

k, (83)

where ck = [cik] with cik = 1L

∑Ll=1 dil d j lη2

l .On the basis of above results, using 1

N L λ◦i tr(,iA)−γi → 0,

we eventually obtain that

1N L

hHi V,kVhi − ψ ′

ik

(1 + γi )2N→∞−→ 0 (84)

almost surely. Following similar arguments of above yields

1N L

hHi V2hi − ψ ′

i

(1 + γi )2N→∞−→ 0 (85)

almost surely with ψ ′ = [ψ ′1, . . . ,ψ

′K ]T = (IK −J)−1c where

c ∈ CK has elements [c]i = 1L

∑Ll=1 dliη2

l .Putting the results in (84) and (85) together, we obtain

almost surely

1N L

i =k

pi

1N L hH

i V,kVhi1

N L hHi V2hi

− 1N L

i =k

piψ ′

ik

ψ ′i

N→∞−→ 0. (86)

The deterministic equivalent of the numerator of the SINRin (61) is now easily obtained as

pk| 1

N L hHk vk|2

1N L hH

k V2hk− pk

ψ◦2k

ψ ′k

N→∞−→ 0, (87)

since 1N L hH

k vk − ψ◦k

1+γk

N→∞−→ 0 and 1N L hH

k V2hk − ψ ′k

(1+γk)2N→∞−→

0 almost surely.

Therefore, the deterministic equivalent SINR is given by

sinr◦k = ψ◦2k

ψ ′k

pk

I ◦k + σ 2

k

, (88)

where

I ◦k := 1

(1 + γk)2

(1

N L

K∑

i=1

piψ ′

ik

ψ ′i

)

. (89)

From (88), it follows that p◦k such that sinr◦k = γk is

obtained as

p◦k = γk

ψ ′k

ψ◦2k

(I ◦k + σ 2

k ), (90)

which can be further rewritten as

p◦kψ◦2

k

ψ ′k

1γk

= I ◦k + σ 2

k = 1(1 + γk)2

(1

N L

K∑

i=1

p◦iψ ′

ik

ψ ′i

)

+ σ 2k .

(91)

Therefore, the deterministic equivalent for the powers isgiven by

⎢⎣p◦

1...

p◦K

⎥⎦ = M◦−1

⎢⎣σ 2

1...σ 2

K

⎥⎦,

where

[M◦]i j =

⎧⎪⎪⎪⎨

⎪⎪⎪⎩

1γi

ψ◦2i

ψ ′i

, i = j,

− 1

(1 + γ 2i )L N

ψ ′j i

ψ ′j, i = j.

We thus complete the proof for the asymptotic results foroptimal powers in Lemma 2.

APPENDIX DPROOF OF THEOREM 2

We rewrite ∥vl∥22 as

∥vl∥22 =

K∑

k=1

vHk Qlkvk = 1

N L

K∑

k=1

pkvH

k Qlk vk

∥vk∥22

= 1N L

K∑

k=1

pkhH

k VQlkVhk

hHk V2hk

, (92)

where Qlk ∈ CN L×N L is a block diagonal matrix with theidentify matrix IN as the l-th main diagonal block squarematrix and zeros elsewhere.

Based on the similar arguments in Appendix C, we havethe deterministic equivalents

hHk VQlkVhk − ψkl

(1 + γk)2N→∞−→ 0, (93)

where

ψkl = 1N L

Tr(,kAQklA)

= 1N L

K∑

j=1

λ◦2j ψ j l

(1 + γ j )2

1N L

Tr(,i A, j A

). (94)

Page 13: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

SHI et al.: ENHANCED GROUP SPARSE BEAMFORMING FOR GREEN CLOUD-RAN: A RANDOM MATRIX APPROACH 2523

Observing that1

N LTr(,kAQklA) = 1

Ldklη

2l , (95)

we obtain the final result. We thus complete the proof for theasymptotic results for ∥v[n]

l ∥22’s in Theorem 2.

REFERENCES

[1] Y. Shi, J. Zhang, and K. B. Letaief, “Statistical group sparse beamform-ing for green Cloud-RAN via large system analysis,” in Proc. IEEE Int.Symp. Inf. Theory (ISIT), Jul. 2016, pp. 870–874.

[2] N. Bhushan et al., “Network densification: The dominant theme forwireless evolution into 5G,” IEEE Commun. Mag., vol. 52, no. 2,pp. 82–89, Feb. 2014.

[3] Y. Shi, J. Zhang, K. B. Letaief, B. Bai, and W. Chen, “Large-scale convexoptimization for ultra-dense cloud-RAN,” IEEE Wireless Commun.,vol. 22, no. 3, pp. 84–91, Jun. 2015.

[4] J. G. Andrews, X. Zhang, G. D. Durgin, and A. K. Gupta, “Are weapproaching the fundamental limits of wireless network densification?”IEEE Commun. Mag., vol. 54, no. 10, pp. 184–190, Oct. 2016.

[5] J. G. Andrews et al., “What will 5G be?” IEEE J. Sel. Areas Commun.,vol. 32, no. 6, pp. 1065–1082, Jun. 2014.

[6] M. Peng, Y. Li, J. Jiang, J. Li, and C. Wang, “Heterogeneous cloudradio access networks: A new perspective for enhancing spectraland energy efficiencies,” IEEE Wireless Commun., vol. 21, no. 6,pp. 126–135, Dec. 2014.

[7] Y. Shi, J. Zhang, and K. B. Letaief, “Group sparse beamforming forgreen cloud-RAN,” IEEE Trans. Wireless Commun., vol. 13, no. 5,pp. 2809–2823, May 2014.

[8] D. Wubben et al., “Benefits and impact of cloud computing on 5G signalprocessing: Flexible centralization through cloud-RAN,” IEEE SignalProcess. Mag., vol. 31, no. 6, pp. 35–44, Nov. 2014.

[9] D. Gesbert, S. Hanly, H. Huang, S. S. Shitz, O. Simeone, and W. Yu,“Multi-cell MIMO cooperative networks: A new look at interference,”IEEE J. Sel. Areas Commun., vol. 28, no. 9, pp. 1380–1408, Dec. 2010.

[10] M. Peng, C. Wang, V. Lau, and H. V. Poor, “Fronthaul-constrainedcloud radio access networks: Insights and challenges,” IEEE WirelessCommun., vol. 22, no. 2, pp. 152–160, Apr. 2015.

[11] S.-H. Park, O. Simeone, O. Sahin, and S. Shamai (Shitz), “Fron-thaul compression for cloud radio access networks: Signal processingadvances inspired by network information theory,” IEEE Signal Process.Mag., vol. 31, no. 6, pp. 69–79, Nov. 2014.

[12] S. Leyffer, Mixed Integer Nonlinear Programming, vol. 154. New York,NY, USA: Springer, 2012.

[13] E. J. Candès, M. B. Wakin, and S. P. Boyd, “Enhancing sparsity byreweighted ℓ1 minimization,” J. Fourier Anal. Appl., vol. 14, nos. 5–6,pp. 877–905, 2008.

[14] I. Daubechies, R. DeVore, M. Fornasier, and C. S. Güntürk, “Iterativelyreweighted least squares minimization for sparse recovery,” Commun.Pure Appl. Math., vol. 63, no. 1, pp. 1–38, 2010.

[15] Y. Shi, J. Cheng, J. Zhang, B. Bai, W. Chen, and K. B. Letaief,“Smoothed L p-minimization for green cloud-RAN with user admissioncontrol,” IEEE J. Sel. Areas Commun., vol. 34, no. 4, pp. 1022–1036,Apr. 2016.

[16] Y. Shi, J. Zhang, B. O’Donoghue, and K. B. Letaief, “Large-scaleconvex optimization for dense wireless cooperative networks,” IEEETrans. Signal Process., vol. 63, no. 18, pp. 4729–4743, Sep. 2015.

[17] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributedoptimization and statistical learning via the alternating direction methodof multipliers,” Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1–122,Jan. 2011.

[18] B. O’Donoghue, E. Chu, N. Parikh, and S. Boyd, “Conic optimizationvia operator splitting and homogeneous self-dual embedding,” J. Optim.Theory Appl., vol. 169, no. 3, pp. 1042–1068, Jun. 2016.

[19] D. R. Hunter and K. Lange, “A tutorial on MM algorithms,” Amer.Statist., vol. 58, no. 1, pp. 30–37, 2004.

[20] W. Yu and T. Lan, “Transmitter optimization for the multi-antennadownlink with per-antenna power constraints,” IEEE Trans. SignalProcess., vol. 55, no. 6, pp. 2646–2660, Jun. 2007.

[21] B. Dai and W. Yu, “Sparse beamforming and user-centric cluster-ing for downlink cloud radio access network,” IEEE Access, vol. 2,pp. 1326–1339, Oct. 2014.

[22] A. M. Tulino and S. Verdú, “Random matrix theory and wirelesscommunications,” Found. Trends Commun. Inf. Theory, vol. 1, no. 1,pp. 1–182, 2004.

[23] R. Couillet and M. Debbah, Random Matrix Methods for WirelessCommunications. Cambridge, U.K.: Cambridge Univ. Press, 2011.

[24] S. Wagner, R. Couillet, M. Debbah, and D. T. M. Slock, “Large systemanalysis of linear precoding in correlated MISO broadcast channelsunder limited feedback,” IEEE Trans. Inf. Theory, vol. 58, no. 7,pp. 4509–4537, Jul. 2012.

[25] R. Zakhour and S. V. Hanly, “Base station cooperation on the downlink:Large system analysis,” IEEE Trans. Inf. Theory, vol. 58, no. 4,pp. 2079–2106, Apr. 2012.

[26] L. Sanguinetti, R. Couillet, and M. Debbah, “Large system analysis ofbase station cooperation for power minimization,” IEEE Trans. WirelessCommun., vol. 15, no. 8, pp. 5480–5496, Aug. 2016.

[27] R. Couillet and M. McKay, “Large dimensional analysis and optimiza-tion of robust shrinkage covariance matrix estimators,” J. MultivariateAnal., vol. 131, pp. 99–120, Oct. 2014.

[28] J. Zhao, T. Q. S. Quek, and Z. Lei, “Coordinated multipoint transmissionwith limited backhaul data transfer,” IEEE Trans. Wireless Commun.,vol. 12, no. 6, pp. 2762–2775, Jun. 2013.

[29] M. Hong, R. Sun, H. Baligh, and Z. Q. Luo, “Joint base stationclustering and beamformer design for partial coordinated transmissionin heterogeneous networks,” IEEE J. Sel. Areas Commun., vol. 31, no. 2,pp. 226–240, Feb. 2013.

[30] S. Luo, R. Zhang, and T. J. Lim, “Downlink and uplink energyminimization through user association and beamforming in C-RAN,”IEEE Trans. Wireless Commun., vol. 14, no. 1, pp. 494–508,Jan. 2015.

[31] B. Dai and W. Yu, “Energy efficiency of downlink transmission strategiesfor cloud radio access networks,” IEEE J. Sel. Areas Commun., vol. 34,no. 4, pp. 1037–1050, Apr. 2016.

[32] Y. Shi, J. Zhang, and K. Letaief, “Robust group sparse beamforming formulticast green cloud-RAN with imperfect CSI,” IEEE Trans. SignalProcess., vol. 63, no. 17, pp. 4647–4659, Sep. 2015.

[33] X. Peng, Y. Shi, J. Zhang, and K. B. Letaief, “Layered group sparsebeamforming for cache-enabled green wireless networks,” IEEE Trans.Commun., vol. 65, no. 12, pp. 5589–5603, Dec. 2017.

[34] M. Tao, E. Chen, H. Zhou, and W. Yu, “Content-centric sparse multicastbeamforming for cache-enabled cloud RAN,” IEEE Trans. WirelessCommun., vol. 15, no. 9, pp. 6118–6131, Sep. 2016.

[35] J. Zhao, T. Q. S. Quek, and Z. Lei, “Heterogeneous cellular networksusing wireless backhaul: Fast admission control and large systemanalysis,” IEEE J. Sel. Areas Commun., vol. 33, no. 10, pp. 2128–2143,Oct. 2015.

[36] J. Cheng, Y. Shi, B. Bai, and W. Chen, “Computation offloadingin cloud-ran based mobile cloud computing system,” in Proc. IEEEInt. Conf. Commun. (ICC), Kuala Lumpur, Malaysia, May 2016,pp. 1–6.

[37] W.-C. Liao, M. Hong, Y.-F. Liu, and Z.-Q. Luo, “Base station activationand linear transceiver design for optimal resource management inheterogeneous networks,” IEEE Trans. Signal Process., vol. 62, no. 15,pp. 3939–3952, Aug. 2014.

[38] F. Rusek et al., “Scaling up MIMO: Opportunities and challenges withvery large arrays,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 40–60,Jan. 2013.

[39] R. Couillet and M. Debbah, “Signal processing in large systems: Anew paradigm,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 24–39,Jan. 2013.

[40] J. A. Tropp, “An introduction to matrix concentration inequalities,”Found. Trends Mach. Learn., vol. 8, pp. 1–230, May 2015.

[41] R. Couillet, G. Wainrib, H. T. Ali, and H. Sevi, “A random matrixapproach to echo-state neural networks,” in Proc. 33th Int. Conf. Mach.Learn. (ICML), 2016, pp. 517–525.

[42] S. Bi, R. Zhang, Z. Ding, and S. Cui, “Wireless communications in theera of big data,” IEEE Commun. Mag., vol. 53, no. 10, pp. 190–199,Oct. 2015.

[43] S. Lakshminarayana, M. Assaad, and M. Debbah, “Coordinated multicellbeamforming for massive MIMO: A random matrix approach,” IEEETrans. Inf. Theory, vol. 61, no. 6, pp. 3387–3412, Jun. 2015.

[44] K. Lange and J. S. Sinsheimer, “Normal/independent distributions andtheir applications in robust regression,” J. Comput. Graph. Statist., vol. 2,no. 2, pp. 175–198, 1993.

[45] H. Dahrouj and W. Yu, “Coordinated beamforming for the multicellmulti-antenna wireless system,” IEEE Trans. Wireless Commun., vol. 9,no. 5, pp. 1748–1759, May 2010.

[46] S. P. Boyd and L. Vandenberghe, Convex Optimization. Cambridge,U.K.: Cambridge Univ. Press, 2004.

Page 14: IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. …shiyuanming.github.io/papers/Journal/rmt_twc18.pdf · turns out to be a mixed-integer nonlinear programming prob-lem [12], which

2524 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 17, NO. 4, APRIL 2018

[47] R. T. Rockafellar, Convex Analysis, vol. 28. Princeton, NJ, USA:Princeton Univ. Press, 1997.

[48] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge, U.K.:Cambridge Univ. Press, 2012.

Yuanming Shi (S’13–M’15) received the B.S.degree in electronic engineering from Tsinghua Uni-versity, Beijing, China, in 2011, and the Ph.D. degreein electronic and computer engineering from TheHong Kong University of Science and Technologyin 2015. Since 2015, he has been with the School ofInformation Science and Technology, ShanghaiTechUniversity, as a tenure-track Assistant Professor.He visited the University of California at Berkeley,Berkeley, CA, USA, from 2016 to 2017. He wasa recipient of the 2016 IEEE Marconi Prize Paper

Award in Wireless Communications, and the 2016 Young Author Best PaperAward by the IEEE Signal Processing Society. He was involved in opti-mization theory and algorithms, deep and machine learning, high-dimensionalprobability and statistics, with applications in intelligent IoT, large-scale dataanalysis, and distributed edge computing.

Jun Zhang (S’06–M’10–SM’15) received theB.Eng. degree in electronic engineering from theUniversity of Science and Technology of Chinain 2004, the M.Phil. degree in information engi-neering from the Chinese University of Hong Kongin 2006, and the Ph.D. degree in electrical and com-puter engineering from The University of Texas atAustin in 2009. He is currently a Research AssistantProfessor with the Department of Electronic andComputer Engineering, The Hong Kong Universityof Science and Technology. He has co-authored the

book Fundamentals of LTE (Prentice-Hall, 2010). His research interestsinclude dense wireless cooperative networks, mobile edge computing, cloudcomputing, and big data analytics systems.

Dr. Zhang was a co-recipient of the 2016 Marconi Prize Paper Award inWireless Communications, the 2014 Best Paper Award for the EURASIPJournal on Advances in Signal Processing, the IEEE GLOBECOM Best PaperAward in 2017, the IEEE ICC Best Paper Award in 2016, and the IEEEPIMRC Best Paper Award in 2014. A paper he co-authored received the2016 Young Author Best Paper Award of the IEEE Signal Processing Society.He also received the 2016 IEEE ComSoc Asia-Pacific Best Young ResearcherAward. He frequently serves on the technical program committees of majorIEEE conferences in wireless communications, such as ICC, Globecom,WCNC, VTC, and so on, and served as a MAC track co-chair for the IEEEWCNC 2011. He is an Editor of the IEEE TRANSACTIONS ON WIRELESSCOMMUNICATIONS, and was a Guest Editor of the special section on MobileEdge Computing for Wireless Networks in the IEEE ACCESS.

Wei Chen (S’05–M’07–SM’13) received the B.S.and Ph.D. degrees (Hons.) from Tsinghua Univer-sity in 2002 and 2007, respectively. From 2005 to2007, he was also a visiting Ph.D. student at TheHong Kong University of Science & Technology.Since 2007, he has been on the faculty at TsinghuaUniversity, where he is a tenured Full Professor,the Director of the Academic Degree Office ofTsinghua University, and a member of the Uni-versity Council. From 2014 to 2016, he served asthe Deputy Head of the Department of Electronic

Engineering. He has also held visiting appointments at several universities,including most recently at Princeton. His research interests are in the areasof wireless networks and information theory.

Dr. Chen is a member of the National 10000-Talent Program, a CheungKong Young Scholar, and a Chief Scientist of the National 973 Youth Project.He is also supported by the NSFC Excellent Young Investigator Project,the New Century Talent Program of Ministry of Education, and the BeijingNova Program. He received the first prize of the 14th Henry Fok Ying-TungYoung Faculty Award, the 17th Yi-Sheng Mao Science & Technology Awardfor Beijing Youth, the 2017 Young Scientist Award of the China Institute ofCommunications, and the 2015 Information Theory New Star Award of theChina Institute of Electronics. He received the IEEE Marconi Prize PaperAward and the IEEE Comsoc Asia Pacific Board Best Young ResearcherAward in 2009 and 2011, respectively. He is a winner of the National May 1stMedal and Beijing Youth May 4th Medal. He also serves as the ExecutiveChairman of the Youth Forum of the China Institute of Communications.He serves as an Editor of the IEEE TRANSACTIONS ON COMMUNICATIONSand the IEEE TRANSACTIONS ON EDUCATION.

Khaled B. Letaief (S’85–M’86–SM’97–F’03)received the B.S. (Hons.), M.S., and Ph.D. degreesin electrical engineering from Purdue University,West Lafayette, IN, USA, in 1984, 1986, and 1990,respectively. From 1990 to 1993, he was a Fac-ulty Member with The University of Melbourne,Melbourne, VIC, Australia. Since 1993, he has beenwith The Hong Kong University of Science andTechnology (HKUST), where he is known as oneof HKUST’s most distinguished professors. Whileat HKUST, he has held numerous administrative

positions, including the Head of the Electronic and Computer EngineeringDepartment, the Director of the Huawei Innovation Laboratory, and theDirector of the Hong Kong Telecom Institute of Information Technology.While at HKUST, he has also served as the Chair Professor and the Dean ofEngineering. Under his leadership, the school has also dazzled in internationalrankings (rising from No. 26 in 2009 to 14 in the world in 2015 accordingto QS World University Rankings). In 2015, he joined Hamad Bin KhalifaUniversity as a Provost to help establish a research-intensive university inQatar in partnership with esteemed universities that include NorthwesternUniversity, Cornell University, and Texas A&M. He has over 580 journaland conference papers and has given keynote talks and courses all over theworld. He has 15 patents, including 11 U.S. patents. He is an internationallyrecognized leader in wireless communications with research interests in greencommunications, Internet of Things, cloud-RANs, and 5G systems. He servedas a consultant for different organizations, including Huawei, ASTRI, ZTE,Nortel, PricewaterhouseCoopers, and Motorola. In addition to his activeresearch and professional activities, he has been a dedicated teacher committedto excellence in teaching and scholarship. He is a fellow of HKIE. He receivedthe Michael G. Gale Medal for Distinguished Teaching (highest university-wide teaching award and only one recipient/year is honored for his/hercontributions). He was a recipient of many other distinguished awards andhonors, including the 2007 IEEE Joseph LoCicero Publications ExemplaryAward, the 2009 IEEE Marconi Prize Award in Wireless Communications,the 2010 Purdue University Outstanding Electrical and Computer EngineerAward, the 2011 IEEE Harold Sobol Award, the 2016 IEEE Marconi PrizePaper Award in Wireless Communications, and over 11 IEEE best paperawards. He is recognized as a longtime volunteer with dedicated service toprofessional societies and in particular IEEE, where he has served in manyleadership positions including as Treasurer, a Vice President for Conferences,and a Vice President for Technical Activities of the IEEE CommunicationsSociety. He is also recognized by Thomson Reuters as an ISI Highly CitedResearcher (which places him among the top 250 preeminent individualresearchers in the field of computer science and engineering) and is currentlythe President of the IEEE Communications Society in 2018. He has servedon the editorial board of other prestigious journals. He is the founding Editor-in-Chief of the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS.