
UNIVERSITY OF OULU P.O. Box 8000 FI-90014 UNIVERSITY OF OULU FINLAND

ACTA UNIVERSITATIS OULUENSIS

University Lecturer Tuomo Glumoff

University Lecturer Santeri Palviainen

Senior research fellow Jari Juuti

Professor Olli Vuolteenaho

University Lecturer Veli-Matti Ulvinen

Planning Director Pertti Tikkanen

Professor Jari Juga

University Lecturer Anu Soikkeli


Publications Editor Kirsti Nurkkala

ISBN 978-952-62-2242-4 (Paperback)
ISBN 978-952-62-2243-1 (PDF)
ISSN 0355-3213 (Print)
ISSN 1796-2226 (Online)

ACTA UNIVERSITATIS OULUENSIS

C TECHNICA

OULU 2019

C 703

Kien Vu

INTEGRATED ACCESS-BACKHAUL FOR 5G WIRELESS NETWORKS

UNIVERSITY OF OULU GRADUATE SCHOOL; UNIVERSITY OF OULU, FACULTY OF INFORMATION TECHNOLOGY AND ELECTRICAL ENGINEERING; CENTRE FOR WIRELESS COMMUNICATIONS


ACTA UNIVERSITATIS OULUENSIS C Technica 703

KIEN VU


Academic dissertation to be presented with the assent of the Doctoral Training Committee of Information Technology and Electrical Engineering of the University of Oulu for public defence in the OP auditorium (L10), Linnanmaa, on 13 May 2019, at 12 noon

UNIVERSITY OF OULU, OULU 2019

Copyright © 2019
Acta Univ. Oul. C 703, 2019

Supervised by
Professor Matti Latva-aho
Associate Professor Mehdi Bennis

Reviewed by
Professor Petar Popovski
Associate Professor Ming Xiao

Opponent
Professor Risto Wichman

ISBN 978-952-62-2242-4 (Paperback)
ISBN 978-952-62-2243-1 (PDF)

ISSN 0355-3213 (Printed)
ISSN 1796-2226 (Online)

Cover Design
Raimo Ahonen

JUVENES PRINT
TAMPERE 2019

Vu, Kien, Integrated access-backhaul for 5G wireless networks.
University of Oulu Graduate School; University of Oulu, Faculty of Information Technology and Electrical Engineering; Centre for Wireless Communications
Acta Univ. Oul. C 703, 2019
University of Oulu, P.O. Box 8000, FI-90014 University of Oulu, Finland

Abstract

With the unprecedented growth in mobile data traffic and network densification, the emerging fifth-generation (5G) wireless network warrants a paradigm shift with respect to system design and technological enablers. In this regard, the prime motivation of this thesis is to propose an integrated access-backhaul (IAB) framework to dynamically schedule users, while efficiently providing a wireless backhaul to dense small cells and mitigating interference. In addition, joint resource allocation and interference mitigation solutions are proposed for two-hop and multi-hop self-backhauled millimeter wave (mmWave) networks.

The first contribution of this thesis focuses on a multi-user two-hop relay cellular system in which a massive antenna array enabled macro base station (BS) simultaneously provides high beamforming gains to outdoor users, and wireless backhauling to outdoor small cells. Moreover, a hierarchical interference mitigation scheme is applied to efficiently mitigate cross-tier and co-tier interference.

In the second contribution, a multi-hop self-backhauled mmWave communication scenario is studied whereby a joint multi-hop multi-path selection and rate allocation framework is proposed to enable Gbps data rates with reliable communications. Using reinforcement learning techniques, a dynamic and efficient re-routing solution is proposed to cope with blockage and latency constraints. Finally, a risk-sensitive learning solution is leveraged to provide high-reliability and low-latency communications.

In summary, the dissertation analyses key trade-offs between (i) capacity and latency and (ii) reliability and network density. Extensive simulations were carried out to verify the performance gains of the proposed algorithms compared to several baselines and for different network settings. Key findings show significant improvements in terms of higher data rates, lower latency, and more reliable communications, with some trade-offs.

Keywords: 5G, integrated access and backhaul, latency, massive MIMO, mmWave communications, reliability, ultra-dense networks

Vu, Kien, Integroitu liityntä- ja runkoverkkoyhteys langattomiin 5G-verkkoihin [Integrated access and backhaul connectivity for 5G wireless networks].
University of Oulu Graduate School; University of Oulu, Faculty of Information Technology and Electrical Engineering; Centre for Wireless Communications
Acta Univ. Oul. C 703, 2019
University of Oulu, P.O. Box 8000, FI-90014 University of Oulu, Finland

Tiivistelmä

As a result of the unprecedented growth in mobile data traffic and the densification of networks, the system design and the use of technological enablers for the fifth-generation (5G) wireless networks that will soon be deployed have had to be approached from an entirely new perspective. The leading idea of this dissertation is therefore to propose an integrated model for network access and backhaul connectivity in which users are scheduled dynamically while efficient backhaul connections are formed for small cells. To this end, joint solutions for resource allocation and interference mitigation are studied that support two-hop and multi-hop connections and simultaneous backhaul establishment in millimeter wave networks.

The first part of the work focuses on a multi-user, relay-assisted two-hop cellular network in which the macro base station uses a large antenna array to form high-gain antenna beams simultaneously for the user links and for the wireless backhaul segment. In addition, a hierarchical interference mitigation method is applied to efficiently reduce co-tier and cross-tier interference.

The next part of the work assesses the research problem of establishing multi-hop backhaul connections in millimeter wave communication by developing a joint method for multi-hop multi-path selection and the allocation of transmission resources. The aim is gigabit-class data rates and reliable communication in the millimeter wave band. Using reinforcement learning, a dynamic and efficient re-routing concept is presented that copes with blockage and latency constraints. Finally, risk-sensitive learning and antenna diversity techniques are leveraged to achieve high reliability and low latency in millimeter wave transmission.

With these, trade-offs are analysed, for example between (i) capacity and latency and (ii) reliability and network density/load. Extensive simulations demonstrate the performance gains of the proposed algorithms over known baselines in several different scenarios. The results show significant cost savings in infrastructure and backhaul, as well as high data rates and improvements in low-latency reliable communication.

Keywords: integrated access and backhaul, beamforming design, massive MIMO, millimeter wave communications, ultra-dense small cell network

Dedicated to my friends

and to my family

Preface

This work was carried out at the Centre for Wireless Communications (CWC) and the Faculty of Information Technology and Electrical Engineering (ITEE) at the University of Oulu, Finland, starting in November 2014. It would not have been possible without the encouragement, help, and guidance that I received over the years from many individuals.

First, I would like to express my sincerest gratitude to my supervisors, Professor Matti Latva-aho and Associate Professor Mehdi Bennis, for providing the opportunity to pursue my doctoral studies. I greatly appreciate their vast knowledge and inspiring ideas, which cover a multitude of areas, as well as their continual support, guidance, and encouragement throughout my postgraduate research and studies. Furthermore, I would like to thank Professor Mérouane Debbah from Huawei R&D in France for his valuable support and comments on my work. It was an honour to carry out my doctoral research under his guidance.

I would also like to thank my follow-up group, Professor Markku Juntti and Adjunct Professor Pekka Pirinen from the University of Oulu, for their insightful advice and discussions during my doctoral studies. I further wish to express my gratitude to the pre-examiners of this thesis, Professor Petar Popovski from Aalborg University, Denmark, and Associate Professor Ming Xiao from KTH Royal Institute of Technology, Sweden, for their constructive comments, and to Professor Risto Wichman from Aalto University, Finland, for acting as the opponent in my doctoral defence. I would also like to thank Dr. Le-Nam Tran from University College Dublin and Dr. Vinh Phan from Nokia Bell Labs for their invaluable advice and support during my doctoral studies. Furthermore, I would like to thank Professor ZhiSheng Niu for hosting my research visit at Tsinghua University, and Chen Sheng for his company and support. Importantly, I would like to thank all the editors and anonymous reviewers for their constructive comments on the work.

This thesis was financially supported in part by the Academy of Finland 6Genesis Flagship project (grant 318927). While conducting the research, I was able to work on several projects. The work has been supported by the Academy of Finland under the project 5Gto10G, the Software Defined Hyper Cellular Architecture project for Green and Smart Service Provisioning in 5G Networks (HYPER5G), the Higher Frequency 5G Communications (High5) project, the Academy of Finland funding via grant 307492 and the CARMA grants 294128 and 289611, and the project SMARTER. I have also been fortunate to receive personal grants from the Nokia Foundation, the Finnish Foundation for Technology Promotion, the Riitta and Jorma J. Takanen Foundation, the UNIOGS travel grant, and the Tauno Tönning Foundation. The support of all these funders is highly appreciated.

Alone we can do so little; but together we can do so much. I would like to thank our research team, Sumudu, Chen-Feng, Elbamby, Cristina, Petri, Mohamed, Jihong, Hamza, Hamid, Anis, and Mounssif, for the countless productive discussions and meetings. My special thanks also go to Sumudu for his great help at all times. He always provided good discussions and made good suggestions for the research and other practical matters. Chen-Feng, Cristina, and Elbamby were also very supportive with practical issues at work, and we also had nice after-work conversations. The countless friendly discussions on professional as well as personal matters with them allowed me to stay focused and high-spirited, for which I am truly grateful. Life without coffee is impossible, and I would like to thank Elbamby for his company and unfailing support. I would also like to thank Giang, Satya, and Doanh, who provided a lot of valuable advice regarding my research problem. I would like to thank Ayotunde and Manosa for all the moments we have shared in the office TS 414. Further, I would like to thank all my CWC colleagues for maintaining an inspiring and supportive working environment. They are too many to name individually, but they include Parisa, Iran, Jiquang, Qiang, Moiz, Samad, Inosha, Makus, Oskari, Nuutti, Jari Marjakangas, Hamidreza, Tachporn, Mojtaba and many others. I would also like to thank the administrative staff from CWC and UoU, including Kirsi Ojutkangas, Jari Sillanpää, Eija Pajunen and Anu Niskanen, for their unfailing support and assistance.

I am also thankful to the Vietnamese community: Thang, Khanh & Ha, Nhat, Kien Ngo, Lan, Ha, Vu, Minh Thuy, Thao Pham, Thao Duong, Hoang, Tam, Dung, Linh, Dat, Phong, Tri, Lam & Linh, Hong, and the many others who are or were here during the past couple of years, for their friendship and the memorable moments. Life in Oulu would not have been as wonderful without them. My special thanks also go to JP and Mai, and Tai for their friendship and for making our gatherings so memorable. I would like to thank the family of Sxu, Phuong, and Tara, and Mr. Xuan Bao for their understanding and friendship, and for helping me stay sane through the difficult times since the day I came to Finland. Without their advice and support, my life probably would have proceeded down the wrong path, and I am so grateful for all that Phuong and Sxu have done. I would also like to thank the family of Jussi and Miia for their friendship and encouragement. Special thanks also go to my friends far away in Vietnam: Anh Tuan, Hai, Hoang Anh, Nam, Chung, Nguyen Tuan, Binh, Trang, Thuc, Hiep, Thang, Tran Dung; and in South Korea: Khanh, Duc, Quoc Hoan, Ngoc Hoan, Anh Tuan, Minh Luan and many others.

Last and definitely not least, I would not be standing here without the endless love, support, and inspiration from all my relatives, Binh Minh, Huu Khanh, Phuong Le, Mai Huong, Khanh Huong, and my big family. I would like to thank my parents, my sister, my nephew, and my girlfriend for their love and care, and for being the ultimate source of courage and strength in my life.

Oulu, November 2018.

Kien Vu


List of abbreviations

Acronyms:

5G 5th generation

BG Boltzmann-Gibbs

BS Base station

CA Closed access

CCDF Complementary cumulative distribution function

CCP Convex-concave procedure

CDF Cumulative distribution function

CSI Channel state information

DC Difference of convex functions

DL Downlink

DPP Drift-plus-penalty

FD Full-duplex

HA Hybrid access

HD Half-duplex

HetNet Heterogeneous network

HomNet Homogeneous network

INR Interference-to-noise ratio

ISD Inter-site distance

KKT Karush-Kuhn-Tucker

LOS Line-of-sight

MBS Macro cell base station

MIMO Multiple-input multiple-output

MISO Multiple-input single-output

mmWave Millimeter wave

MUE Macro cell user equipment

NLOS Non-line-of-sight

NUM Network utility maximization

OA Open access

PF Proportional fair

QoS Quality-of-service


QSI Queue state information

RAN Radio access network

RHS Right-hand side

RL Reinforcement learning

RMT Random matrix theory

RSL Risk-sensitive reinforcement learning

RZF Regularized zero-forcing

SC Small cell base station

SCA Successive convex approximation

SIC Self-interference cancellation

SINR Signal-to-interference-plus-noise ratio

SNR Signal-to-noise ratio

SOCP Second-order cone programming

SUE Small cell user equipment

TDD Time division duplexing

TNU Total network utility

UDN Ultra-dense network

UE User equipment

UL Uplink

URC Ultra-reliable communication

URLLC Ultra-reliable low latency communication

UT User throughput

WSRM Weighted sum rate maximization

ZF Zero-forcing

Roman-letter notations:

$a_m$ Data arrival destined for UE $m$

$\bar{a}_m$ Mean arrival rate at UE $m$

$a_m^{\max}$ Maximum data arrival destined for UE $m$

$B$ Number of all base stations

$\mathbf{H}^{(b)}$ Channel matrix between all UEs and the BS $b$ in chapter 3

$\mathbf{h}_m^{(b)}$ Channel vector between the $m$th UE and the BS $b$ in chapter 3

$h_m^{(b,n)}$ Channel between the $m$th MUE and the $n$th antenna of BS $b$ in chapter 3

$\hat{\mathbf{H}}^{(b),M}$ Estimate of channel matrix $\mathbf{H}^{(b),M}$ in chapter 3

$\mathbf{h}_u^{(b_s)}$ Channel vector between the $u$th UE and the SC $b_s$ in chapter 3

$\mathbf{H}^{(i,j)}$ Channel matrix between transmitter $i$ and receiver $j$ in chapter 4

$\hat{\mathbf{H}}^{(i,j)}$ Estimate of channel matrix between transmitter $i$ and receiver $j$ in chapter 4

$\mathbf{H}_m$ Channel matrix between the MBS and a UE $m$ in chapter 5

$\hat{\mathbf{H}}_m$ Estimate of channel matrix between the MBS and a UE $m$ in chapter 5

$\mathbf{H}_{bk}$ Channel matrix between BS $b$ and UE $k$ in chapter 6

$\hat{\mathbf{H}}_{bk}$ Estimate of channel matrix between BS $b$ and UE $k$ in chapter 6

$K$ Number of user equipments

$\mathbf{l}$ Load balancing variable vector

$l_{c_s}^{(b_s)}$ Transmission association indicator from SC $b_s$ to SUE $c_s$

$l_m^{(b_s)}$ Transmission association indicator from BS $b_s \in \mathcal{B}$ to UE $m$

$l_{s+M}^{(b_0)}$ Transmission association indicator from MBS $b_0$ to SC $s$

$M$ Number of macro user equipments

$N$ Number of antennas at the MBS

$N_s$ Number of antennas at the SC $s$

$N_s^{\mathrm{au}}$ Number of active users at SC $s$

$N_s^{\mathrm{tx}}$ Total number of transmissions at SC

$\mathbf{P}$ Transmit power matrix

$\mathbf{p}$ Transmit power allocation vector

$P^{(b_0)}$ or $P$ Maximum transmit power at the MBS

$p_m^{(b_0)}$ DL MBS transmit power assigned to UE $m$

$p_{s+M}^{(b_0)}$ DL MBS transmit power assigned to SC $s$

$Q$ Network queue at the MBS

$Q_m$ Network queue backlog for UE $m$

$\mathbf{r}$ Ergodic data rate vector of all UEs

$\bar{\mathbf{r}}$ Time average expectation of the ergodic data rate vector of all UEs

$r_m^{(b_0)}$ Ergodic data rate at the $m$th UE from the BS $b$

$S$ Number of small cell base stations

$\mathbf{T}$ Co-tier interference mitigation precoding matrix at the MBS

$\mathbf{U}$ Cross-tier interference mitigation precoding matrix at the MBS

$\mathbf{V}$ Precoding matrix at the MBS

$\mathbf{v}_m$ Precoding vector of UE $m$

$\mathbf{w}_m^{(b_0)}$ Real small-scale fading channel matrix

$\hat{\mathbf{w}}_m^{(b_0)}$ Estimate of the small-scale fading matrix

$x_m^{(b)}$ Signal symbol at the $m$th MUE from the BS $b$

$\mathbf{Y}$ Virtual queue vector for auxiliary variables

$y_m^{(b_0)}$ Received signal at the $m$th UE from the BS $b$

$y_u^{(b_s)}$ Received signal at the $u$th UE from the SC $b_s$

$\mathbf{z}_m^{(b_0)}$ Small-scale fading channel noise matrix at UE $m$

Mathcal-style notations:

$\mathcal{B}$ Set of all base stations

$\mathcal{F}$ Set of data flows

$\mathcal{K}$ Number of single-antenna UEs

$\mathcal{L}$ Set of all directional edges

$\mathcal{M}$ Number of macro UEs

$\mathcal{N}$ Set of all nodes

$\mathcal{N}_i^{(o)}$ Set of all nodes

$\mathcal{N}_i^{(o)}$ Set of the next hops from node $i$

$\mathcal{R}$ Average rate region

$\mathcal{S}$ Set of small cell base stations

$\mathcal{Z}_f$ Set of disjoint paths observed by flow $f$

Greek-letter notations:

$\alpha$ Network control action

$\beta$ Network random event

$\chi$ Approximation factor

$\delta$ Slack variable for SCA method in chapter 3

$\varepsilon$ Reliability target

$\eta_m$ Thermal noise at UE $m$

$\iota$ Learning rate

$\kappa$ Learning temperature rate

$\mu$ Risk-sensitive factor

$\nu$ Lyapunov control parameter

$\omega_m$ Weight of user $m$

$\varphi$ Operation mode to control the FD-enabled SC transmission

$\pi$ Probability of choosing an action

$\Phi$ Regret rate

$\boldsymbol{\Phi}$ Regret rate vector

$\sigma^2$ Thermal noise covariance

$\tau_m$ Channel estimation error of UE $m$

$\boldsymbol{\Theta}$ Spatial channel correlation matrix

$\theta$ Beamwidth

$\phi$ Auxiliary variable

$\zeta$ Regularized zero-forcing parameter

$\Delta(\cdot)$ Lyapunov drift function

$\boldsymbol{\Lambda}$ Composite control variable

$\boldsymbol{\Lambda}^{o}$ Composite control variable of load balancing and operation mode

$\boldsymbol{\Omega}$ Solution of the Stieltjes transformation

$\boldsymbol{\Psi}$ Lyapunov independent constant

$\boldsymbol{\Xi}$ Queue backlog vectors

$\aleph$ Flow utility in chapter 4

$k_0$ Allowed FD INR threshold

$k_m^{(b_s)}$ FD INR from FD-enabled SC $b_s$ to UE $m$

Mathematical operator notations and symbols:

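$|\mathcal{X}|$ Cardinality of the set $\mathcal{X}$

$\mathrm{diag}(\mathbf{x})$ Diagonal matrix with $\mathbf{x}$ as the diagonal

$\|\mathbf{x}\|$ Euclidean norm of vector $\mathbf{x}$

$\mathbb{E}[\cdot]$ Expectation function

$\mathbb{1}_x(\mathcal{X})$ Indicator function, i.e. returns 1 if $x \in \mathcal{X}$, 0 otherwise

$\Pr(\cdot)$ Probability of the event

$x^+$ Returns $x$ if $x > 0$ and 0 otherwise

$\mathbb{Z}$ Set of integers

$\mathbb{R}$ Set of real numbers

$(\cdot)^\star$ Solution of an optimization problem

$\mathbf{X}^T$ Transpose of matrix $\mathbf{X}$

$\mathbf{X}^\dagger$ Hermitian (conjugate) transpose of matrix $\mathbf{X}$

$\mathrm{Rank}(\mathbf{X})$ Rank of matrix $\mathbf{X}$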

Contents

Abstract

Tiivistelmä

Preface

List of abbreviations

Contents

1 Introduction

1.1 Motivation

1.2 5G technologies for mobile broadband

1.2.1 Ultra-dense small cell networks

1.2.2 Massive MIMO

1.2.3 Millimeter wave communications

1.3 Scope of the thesis

1.4 Author’s contribution

2 Research methodologies

2.1 Stochastic optimization for wireless networks

2.1.1 Queuing networks

2.1.2 Auxiliary variables and virtual queues introduction

2.1.3 Lyapunov optimization

2.2 A successive convex approximation technique

2.3 Random matrix theory

2.4 Reinforcement learning

3 Integrated access and backhaul architecture

3.1 Main contributions and related work

3.2 System model

3.3 Load balancing and interference mitigation

3.3.1 Downlink transmission

3.3.2 Joint load balancing and interference mitigation

3.3.3 Closed-form expression via a deterministic equivalent

3.4 Proposed load balancing and interference mitigation

3.4.1 Joint load balancing and operation mode selection

3.4.2 Auxiliary variable optimization

3.4.3 Interference mitigation and power allocation

3.4.4 Queue update

3.5 Numerical results

3.5.1 Simulation environments

3.5.2 Ultra-dense small cells environment

3.5.3 Wireless backhaul impact for different transmit power levels

3.5.4 Convergence

3.6 Summary and discussion

4 Self-backhauled multi-hop architecture

4.2 System model

4.2.1 Network model

4.2.2 mmWave MIMO channel model

4.2.3 Transmission rate

4.2.4 Network queues

4.3 Problem formulation

4.4 Proposed path selection and rate allocation algorithm

4.4.1 Path selection

4.4.2 Rate allocation

4.5.1 Small antenna array system

4.5.2 Large antenna array system

4.5.3 Convergence characteristics

4.6 Summary and discussion

5 Low-latency communication in massive MIMO wireless networks

5.2 System model

5.3 Problem formulation

5.4 Proposed control parameter selection and power allocation

5.4.1 Control parameters selection

5.4.2 Power allocation

5.5.1 Impact of the arrival rate

5.5.2 Impact of user density

5.6 Summary and discussion

6 Ultra-reliable communication in 5G mmWave networks

6.1 Main contributions and related work

6.2 System model

6.3 Problem formulation

6.4 Proposed distributed learning algorithm

6.5 Numerical results

6.6 Summary and discussion

7 Conclusions and future work

7.1 Conclusions

7.2 Future work

References

Appendices

1 Introduction

1.1 Motivation

The unprecedented growth in data traffic, driven by the massive number of connected wireless devices (e.g., mobile phones, laptops, sensing devices) and rich content applications (e.g., video and game streaming, augmented and virtual reality), poses major challenges in terms of extreme data rates, low latency, high reliability, and scalability. The fifth generation (5G) wireless systems are expected to meet these challenges, which requires a paradigm shift in system design and radio technologies. According to the International Telecommunication Union (ITU), 5G encompasses three service categories: enhanced mobile broadband (eMBB), ultra-reliable and low latency communication (URLLC), and massive machine-type communication (mMTC) [1, 2]. In particular, eMBB aims at providing users with high peak data rates and moderate rates for cell-edge users; URLLC supports low-latency transmissions with very high reliability [3, 4, 5, 6]; and mMTC supports a massive number of IoT devices [7, 8]. In this regard, both academia and industry have paid tremendous attention to the underutilised mmWave frequency bands (30–300 GHz) due to the current scarcity of the wireless spectrum [9, 10, 11, 12, 13]. These traffic demands can be met by (i) advanced spectrally-efficient techniques, e.g., massive multiple-input multiple-output (MIMO) [14, 15, 16]; and (ii) ultra-dense self-backhauled small cell (SC) deployments [17, 18, 19, 20, 21, 22]. Indeed, massive MIMO is instrumental in leveraging mmWave frequency bands and providing wireless backhaul and access in ultra-dense network deployments. Furthermore, network densification is a promising technique to boost capacity and extend coverage by reducing the communication range between users and base stations.

Fig. 1. Integrated access and backhaul architecture for the considered 5G network scenario ([23] ©2017 IEEE). The figure shows a massive MIMO MBS with its network queue Q, FD-SCs with queue buffers D, the wireless backhaul and data access links, MUEs served by either the MBS or nearby FD-SCs, and SUEs served by the SC only and interfered by the MBS; useful and interfering signals are marked separately.

This thesis examines three 5G enablers, namely mmWave communications, massive MIMO, and ultra-dense small cells, with the goal of designing and optimizing an integrated access and backhaul (IAB) deployment. A potential use case is augmented and virtual reality, which requires extreme data rates and very low latency. In general, end-to-end latency can be defined as the time taken for a packet generated in a protocol layer at the source to travel through the network to the same layer at the destination, which includes the over-the-air transmission delay, propagation delay, processing/computing delay, retransmission delay, and queuing delay [5, 6]. Reliability can be defined as the probability that a packet is successfully received at the destination within a given deadline. This thesis focuses on the downlink (DL) transmission and the queuing delay, and addresses the following fundamental questions:

– Q1: How can ultra-dense SCs be deployed to serve a large density of user equipments (UEs) in a multi-user two-hop relay IAB scenario as shown in Fig. 1?

– Q2: How should paths be selected and transmission rates be allocated in multi-hop multi-path self-backhauled mmWave networks as shown in Fig. 2?

– Q3: How can low latency communication be enabled for outdoor UEs with eMBB services in massive MIMO wireless systems?

– Q4: How can ultra-reliable communication be provided in ultra-dense SC networks in the presence of risk and uncertainty?
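The latency and reliability definitions above can be made concrete with a small numerical sketch. It is an illustration only, not the evaluation method of this thesis: the function names, the delay values, and the exponential queuing-delay model are assumptions chosen for the example.

```python
import random


def end_to_end_latency(transmission, propagation, processing,
                       retransmission, queuing):
    """Sum the delay components named in the latency definition (seconds)."""
    return transmission + propagation + processing + retransmission + queuing


def empirical_reliability(latencies, deadline):
    """Fraction of packets received within the deadline.

    A lost packet can be represented by float("inf"), which never meets
    the deadline.
    """
    if not latencies:
        raise ValueError("at least one latency sample is required")
    on_time = sum(1 for latency in latencies if latency <= deadline)
    return on_time / len(latencies)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # Hypothetical per-packet delays; only the queuing delay is random here.
    samples = [
        end_to_end_latency(
            transmission=0.5e-3,
            propagation=1e-6,
            processing=0.2e-3,
            retransmission=0.0,
            queuing=rng.expovariate(1000.0),  # mean queuing delay of 1 ms
        )
        for _ in range(10_000)
    ]
    estimate = empirical_reliability(samples, deadline=5e-3)
    print(f"Estimated reliability for a 5 ms deadline: {estimate:.4f}")
```

Tightening the deadline or increasing the mean queuing delay lowers the estimate, which is the latency-reliability coupling that questions Q3 and Q4 are concerned with.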

1.2 5G technologies for mobile broadband

This section briefly introduces the main concepts of ultra-dense small cells, massive MIMO, and millimeter wave communications that are relevant to the scope of the thesis.

Fig. 2. Illustration of 5G multi-hop self-backhauled mmWave networks ([24] ©2018 IEEE). The figure shows a macro BS, self-backhauled SC BSs, and UEs 1 to K connected over Routes 1 to 4, with traffic split and traffic aggregation points, full-duplex communication links, and the one-hop transmission range.

1.2.1 Ultra-dense small cell networks

In order to boost network capacity and expand coverage, the concept of deploying low-cost, low-power SCs over traditional macro cell networks has been investigated [18, 19, 21, 20]. Dense SC deployment brings users closer to the base stations, resulting in improved wireless connectivity. With multiple-antenna arrays at the SCs, hybrid beamforming can be leveraged to achieve higher transmission gains, reduce the transmit power, and mitigate interference between co-tier users. In addition, by exchanging statistical channel information between base stations, cross-tier interference can be reduced by a proper hierarchical beamforming design [25] or through cooperative interference avoidance/management schemes [26, 27, 28, 29]. Furthermore, the concept of cache-enabled SCs was introduced in [30] to reduce the backhaul load and improve the user experience.

Recent advances in full-duplex (FD) communication offer the potential to double capacity and lower latency: in-band FD-enabled SCs relay data from a macro BS to UEs in the same frequency band [31, 32, 33]. As a result, FD enables ultra-dense SC deployments by using wireless backhaul [34, 35, 36, 37]. In particular, with FD communication, SCs transmit and receive signals at the same time using self-interference cancellation algorithms [38]. Instead of using a wired backhaul, the SCs are connected to the core network via a macro BS over a wireless backhaul, thereby reducing the deployment cost compared to traditional cellular networks [39]. In this regard, large antenna arrays employed at the macro BSs provide highly directional beamforming for the SCs [21, 22].

Dense SC deployment has great potential for improving the network capacity, but

it faces some challenges, such as interference mitigation, resource management, and
backhaul/fronthaul limitations. In this regard, this thesis addresses joint load balancing

and interference mitigation optimization under wireless backhaul constraints.

1.2.2 Massive MIMO

The basic concept of massive MIMO is to utilize hundreds or thousands of antennas at

the BS to serve up to tens or hundreds of UEs [14, 15, 16, 40]. Massive MIMO has

been recognized as one of the most promising 5G techniques, which yields remarkable

properties such as high signal-to-interference-plus-noise ratio (SINR) due to extreme

spatial multiplexing gains [41, 25, 42]. In addition, the large spatial degree of freedom

(DoF) of massive MIMO enables the mitigation of cross-tier and co-tier interference

through proper hierarchical precoder design at the BS. In massive MIMO systems, co-channel
time-division duplexing (TDD) is considered in which the macro base stations

and the small cells share the entire bandwidth [21, 15, 43, 42]. In TDD systems, the

channel reciprocity is exploited, and thus, the DL channel can be obtained via the uplink

training phase, which leads to reduced channel training overheads [43, 44]. Importantly,

the channel estimation overhead scales linearly with the number of users and does not depend on

the number of antennas. On the other hand, pilot contamination is considered a ma-

jor performance limiting factor in massive MIMO networks, which occurs when non-

orthogonal pilot sequences are assigned to users. This is why a pilot design should be

taken into account when deploying massive MIMO [45, 46, 47, 48].

1.2.3 Millimeter wave communications

MmWave communications collectively refer to the electromagnetic spectrum between

3 - 300 GHz, which corresponds to wavelengths from 1 mm to 100 mm [9, 10, 1, 13]. A

peculiarity is that mmWave communications experience a high degree of path-loss and

blockage, but have a larger bandwidth with a short wavelength [49, 50, 13]. Thanks to

small wavelengths at higher frequency bands, a large number of antennas can be packed

into a small footprint to achieve highly directional beamforming, which substantially


increases link capacity. Large antenna arrays can be deployed at both the transmitter

and receiver, which yields high spatial multiplexing gains and overcomes high path loss
and high noise power (due to the large bandwidth) without additional transmission

power.
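The frequency and wavelength ranges quoted above are related by the free-space relation wavelength = c / f. The following minimal sketch illustrates that arithmetic only; the function name `wavelength_mm` is a hypothetical helper introduced here, not something defined in the thesis.

```python
# Free-space wavelength from carrier frequency (lambda = c / f).
# `wavelength_mm` is a hypothetical helper used only for illustration.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def wavelength_mm(frequency_hz: float) -> float:
    """Return the free-space wavelength in millimetres."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_LIGHT_M_PER_S / frequency_hz * 1e3


if __name__ == "__main__":
    for f_ghz in (3, 30, 300):
        print(f"{f_ghz} GHz -> {wavelength_mm(f_ghz * 1e9):.2f} mm")
```

The band edges of 3 GHz and 300 GHz map to roughly 100 mm and 1 mm, which matches the range stated in the text.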

In this thesis, the above mentioned 5G technologies are investigated as the key en-

ablers for providing high data rates, low latency, and high reliability. This combination

brings paradigm shifts in terms of system design and fundamental challenges. These

are as follows:

– Integrated access and backhaul: The next generation cellular wireless systems will

be required to dynamically schedule users and efficiently provide a wireless backhaul

to small cells under channel and network dynamics, while satisfying users’ QoS/QoE

requirements [51, 52, 53]. The main contribution of this thesis is to propose an

architecture to support both access and wireless backhaul [54, 55, 56, 23].

– Multi-connectivity and spatial diversity: To improve data rates and reliability,

multi-connectivity and antenna diversity have been studied for decades [57, 58]. For

example, a user can connect to multiple base stations to transmit and receive multiple

copies of data, which improves reliability and capacity [59, 60, 61, 62].

– Load balancing and interference mitigation: The problems of load balancing and

interference mitigation become critical for large number of UEs and BSs [63, 64, 23].

The key questions are how to associate UEs with which BSs, and how to mitigate

both co-tier and cross-tier interference [65, 66, 67, 68].

– Beamforming design and tracking: Beamforming is an important strategy to ob-

tain higher transmission gains and alleviate interference [69, 41, 70, 71]. Last but

not least, in high mobility environments, the problems of beam tracking, mobility

management, and handover are very challenging.

1.3 Scope of the thesis

This thesis consists of seven chapters. The first chapter starts by providing a brief

overview of 5G networks. Following that, the main research questions are formulated.

Then, enablers and challenges concerning 5G technologies are provided. In the second

chapter, the research methodologies used to analyse and optimize the considered net-

work scenarios are introduced. In the next four chapters, the author answers the above


research questions. Finally, conclusions are drawn, while highlighting future research

directions. In summary, the contributions of each chapter are as follows:

Chapter 2: The second chapter briefly provides a general background on the mathe-

matical tools used throughout the body of the thesis. Specifically, the basics of

stochastic optimization are introduced to model and solve dynamic network opti-

mization problems. The author then discusses elements of random matrix theory,

which is yet another powerful tool to tackle problems involving high dimensional

data. Due to the non-convex nature of resource allocation problems in wireless

networks, the successive convex approximation technique is used to efficiently

seek local optimal solutions. The final section provides a brief discussion on

reinforcement learning, which is instrumental in addressing uncertainty and risky
events in dynamic stochastic networks.

Chapter 3: This chapter proposes a novel integrated access and backhaul architecture

to study the problem of joint load balancing and interference mitigation in het-

erogeneous networks (HetNets). In particular, a massive MIMO macro cell BS

equipped with a large number of antennas, overlaid with wireless self-backhauled

SCs is assumed. Self-backhauled SC BSs with full-duplex communication em-

ploying regular antenna arrays serve their SC users and offload cell-edge macro

users, by using the wireless backhaul from macro BS in the same frequency band.

The joint load balancing and interference mitigation problem is formulated as

a network utility maximization subject to wireless backhaul constraints. Due

to the non-tractability of the problem, the author applies random matrix theory

to obtain a closed-form expression of the achievable rate and transmit power in

the asymptotic regime, i.e., as the number of antennas and users grows large.

Subsequently, leveraging stochastic optimization, the problem is decoupled into

dynamic scheduling of macro cell users, backhaul provisioning of SCs, and of-

floading macro cell users to SCs as a function of interference and backhaul links.

The proposed algorithm is analysed and validated by taking the impact of SCs

density and transmit power at low and high frequency bands into account.

Chapter 4: In this chapter, a novel solution is proposed to provide Gbps multi-hop

transmissions with latency guarantees. Owing to the severe path loss and unre-

liable transmission over a long distance at higher frequency bands, the author

investigates the problem of path selection and rate allocation for multi-hop self-backhaul
mmWave networks. For this purpose, a new system design is advocated

by exploiting multiple antenna diversity, mmWave bandwidth, and traffic split-

ting techniques. The studied problem is cast as a network utility maximization

problem, subject to a probabilistic latency constraint, network stability, and dy-

namics. By leveraging stochastic optimization, the problem is decoupled into:

(i) path selection and (ii) rate allocation sub-problems, whereby a framework

which selects the best paths is proposed using reinforcement learning techniques.

Moreover, the rate allocation is a non-convex program, which is converted into a

convex problem, and solved using the successive convex approximation method.

Chapter 5: This chapter addresses the fundamental question of how to simultaneously

provide orders of magnitude in capacity improvements and latency reduction. In

particular, the problem of ultra-low latency communication (ULC) is investigated in

mmWave-enabled massive multiple-input multiple-output (MIMO) networks. To

address this matter, the Lyapunov optimization framework is extended to incor-

porate probabilistic latency constraints, which takes the queue length, arrival rate,

and channel variations into account. The studied problem is then decoupled into

a dynamic latency control and rate allocation. Here, the latency control problem

is a difference of convex (DC) programming problem, which is solved efficiently

by the convex-concave procedure (CCP).

Chapter 6: In this chapter, another approach is proposed to enhance ultra-reliable

communication (URC) in 5G mmWave massive MIMO networks. In contrast to

the classical network design based on average metrics, our design objective is to

take both the average metrics and variance of the network utility function into ac-

count. Due to the sensitivity of mmWave links, the proposed solution leverages

principles of risk-sensitive reinforcement learning (RSL) and exploits multiple

antenna diversity and higher bandwidth to optimize transmissions and achieve

Gbps data rates. The prime motivation behind using RSL stems from the fact

that the risk-sensitive utility function to be optimized is a function of not only the

average but also the variance, and thus it captures the tail of the rate distribution

to enable URC. To that end, a distributed risk-sensitive reinforcement learning-

based framework is advocated to jointly optimize the beamwidth and transmit

power. Moreover, the proposed algorithm is fully distributed, and does not require
full network observation.

Chapter 7: This chapter draws the main conclusions of the thesis and discusses future

research directions.

1.4 Author’s contribution

The author’s research work at the University of Oulu has been published in four journal

papers [23, 24, 72, 73], and two conference papers [74, 75]. The thesis is based on all

these works [23, 24, 72, 73, 74, 75], and provides new radio access solutions to enable

multiple gigabit data rates and ultra-reliable and low latency communications. By ap-

plying advanced signal processing techniques, mathematical optimization frameworks,

and reinforcement learning tools, the research provides important solutions to establish

key trade-offs, between aspects such as capacity and latency, and reliability and net-

work density/traffic loads. As the leading author of all the papers above, the author of

the thesis had the main responsibility in proposing the original ideas, formulating the

problems, deriving the mathematical algorithm, conducting the analysis, developing

and carrying out the simulations, evaluating the numerical results, writing the original

papers, and handling the review process. The co-authors provided invaluable comments,

criticism, and supporting ideas for the research.


2 Research methodologies

This chapter describes the mathematical tools used to model and optimize the studied

networks. In particular, the Lyapunov optimization framework, successive convex ap-

proximation method, random matrix theory, and reinforcement learning are sequentially

introduced as follows:

2.1 Stochastic optimization for wireless networks

Stochastic optimization has found applications in wireless networks in the presence of

randomness [76, 77]. For instance, the dynamic nature of wireless channels and stochas-

tic arrivals involves uncertainties and randomness. In the following text, the author

provides a basic introduction of stochastic optimization to model a general stochastic

network optimization problem and solve it by using the Lyapunov drift and a penalty

technique [76, 77].

2.1.1 Queuing networks

Consider a stochastic queuing network that operates in slotted time $t \in \{0, 1, 2, \ldots\}$ [76,
77]. We assume that there are $K$ queues in the network, and the queue vector
$\mathbf{Q}(t) = (Q_1(t), \cdots, Q_K(t))$ stores the data backlog at each time slot $t$. For instance, one
base station serves up to $K$ users in cellular networks. We first define $\alpha(t)$ as the control
action, e.g., power/spectrum allocation, scheduling, routing, or caching. Let $\beta(t)$
denote the random network event, e.g., arrival rate, channels, or queue state. Here, $\mathcal{A}_{\beta(t)}$
denotes the set of possible control actions. Let $a_k(t) = a_k(\alpha(t), \beta(t))$ denote the bursty
data arrival for each user $k$, which is i.i.d. over slots and whose second moment is bounded by some
finite constant. We define the network attribute as $\mathbf{x}(t) = (x_1(t), \cdots, x_K(t))$ on slot $t$, in
which $x_k(t) = x_k(\alpha(t), \beta(t))$ may refer to the network throughput (serving rate or admission
rate), transmit power, packet drop rate, latency, cost, or profit. We also assume
that the second moment of the network attribute is bounded, as for the arrival rate.
We define the network regime $\mathcal{R}$ as the convex hull of $\mathbf{x}(t)$. The queuing evolution is
given by

$$Q_k(t+1) = \max\left[Q_k(t) - x_k(t),\, 0\right] + a_k(t). \tag{1}$$

Fig. 3. Queuing network model.
[Figure: arrivals $a_k(t)$ (arrival rate) enter queue $Q_k(t)$, which is drained by $x_k(t)$ (serving rate).]

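Definition 2.1. [Time average expectation] For any vector $\mathbf{x}(t) = (x_1(t), \ldots, x_K(t))$, let
$\bar{\mathbf{x}} = (\bar{x}_1, \cdots, \bar{x}_K)$ denote the time average expectation of $\mathbf{x}(t)$, such that

$$\bar{\mathbf{x}} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\left[\mathbf{x}(\tau)\right].$$

Definition 2.2. [Queue stability] For any discrete queue $Q(t)$ over time slots $t \in \{0, 1, \ldots\}$
with $Q(t) \in \mathbb{R}_{+}$, $Q(t)$ is stable if

$$\bar{Q} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\left[|Q(\tau)|\right] < \infty.$$

A queue network is stable if each queue is stable.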
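To make the queue evolution (1) and the stability notion of Definition 2.2 concrete, the sketch below simulates a single queue and reports the empirical time-average backlog. It is only an illustration under stated assumptions: the exponential arrival and service processes, the function name `average_backlog`, and the use of a single sample path in place of the expectation are all choices made here, not part of the thesis.

```python
import random


def average_backlog(mean_arrival, mean_service, num_slots=20000, seed=0):
    """Simulate Q(t+1) = max[Q(t) - x(t), 0] + a(t) and return the
    empirical time average (1/t) * sum_{tau < t} Q(tau)."""
    rng = random.Random(seed)
    queue = 0.0
    backlog_sum = 0.0
    for _ in range(num_slots):
        backlog_sum += queue
        # Assumed (hypothetical) i.i.d. processes: exponential arrivals and service.
        arrival = rng.expovariate(1.0 / mean_arrival)
        service = rng.expovariate(1.0 / mean_service)
        queue = max(queue - service, 0.0) + arrival  # queue evolution, Eq. (1)
    return backlog_sum / num_slots


if __name__ == "__main__":
    print("service faster than arrivals:", average_backlog(1.0, 2.0))
    print("arrivals faster than service:", average_backlog(2.0, 1.0))
```

When the mean service exceeds the mean arrival, the time-average backlog settles at a small value, whereas in the opposite case it keeps growing with the simulation horizon, which is the behaviour that Definition 2.2 rules out.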

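The objective is to determine the actions over time to optimize the following general
stochastic network optimization problem [76, 77]:

$$
\begin{aligned}
\max \quad & f_0(\bar{\mathbf{x}}) && \text{(2a)} \\
\text{subject to} \quad & g(\bar{\mathbf{x}}) \le 0, && \text{(2b)} \\
& i(\bar{\mathbf{x}}) = 0, && \text{(2c)} \\
& \bar{\mathbf{x}} \in \mathcal{R}, && \text{(2d)} \\
& \text{Queue stability}, \ \forall k, && \text{(2e)} \\
& \alpha(t) \in \mathcal{A}_{\beta(t)}, \ \forall t, && \text{(2f)}
\end{aligned}
$$

where $f_0(\bar{\mathbf{x}}) = \sum_{k=1}^{K} \omega_k f(\bar{x}_k)$, $\omega_k \ge 0$ is the weight of user $k$, and $f(\cdot)$ is assumed to
be a twice differentiable, concave, and increasing $L$-Lipschitz function for all $\bar{\mathbf{x}} \in \mathcal{R}$.
$g(\cdot)$ is a continuous convex/non-convex function for all $\bar{\mathbf{x}} \in \mathcal{R}$. In addition, $i(\cdot)$ is a
linear (or non-linear) continuous function (e.g., a power constraint).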

2.1.2 Auxiliary variables and virtual queues introduction

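To enable the abstract set constraint (2d) to be met and optimize (2) over possibly non-convex
or non-linear functions, we equivalently transform (2) by introducing the auxiliary
variables $\boldsymbol{\varphi}(t) = (\varphi_1(t), \ldots, \varphi_K(t))$ that satisfy $\boldsymbol{\varphi}(t) \le \mathbf{x}(t)$, where

$$\bar{\varphi}_k \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}_{\beta}\left[\varphi_k(\tau)\right]. \tag{3}$$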

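We can rewrite the constraint functions in (2b) and (2c) as

$$\overline{g(\boldsymbol{\varphi})} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}_{\beta}\left[g(\boldsymbol{\varphi}(\tau))\right], \tag{4}$$

$$\overline{i(\boldsymbol{\varphi})} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}_{\beta}\left[i(\boldsymbol{\varphi}(\tau))\right]. \tag{5}$$

With the above transformation, we convert a function of the time average to a time
average of functions, which makes the problem easier to solve. Thus, we can refine (2)
as follows:

$$
\begin{aligned}
\min \quad & -\overline{f(\boldsymbol{\varphi})} && \text{(6a)} \\
\text{subject to} \quad & \boldsymbol{\varphi}(t) \le \mathbf{x}(t), && \text{(6b)} \\
& \overline{g(\boldsymbol{\varphi})} \le 0, && \text{(6c)} \\
& \overline{i(\boldsymbol{\varphi})} = 0, && \text{(6d)} \\
& \bar{\boldsymbol{\varphi}} \in \mathcal{R}, && \text{(6e)} \\
& \text{(2e), (2f)}.
\end{aligned}
$$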

To meet (6b), we introduce a virtual queue vector $\mathbf{Y}(t)$ as

$$Y_k(t+1) = \max\left[Y_k(t) + \varphi_k(t) - x_k(t),\, 0\right], \quad \forall k \in \mathcal{K}. \tag{7}$$

Next, two virtual queues $G(t)$ and $I(t)$ are defined to replace the inequality constraint
(6c) and the equality constraint (6d), respectively, which are given by

$$G(t+1) = \max\left[G(t) + g(\boldsymbol{\varphi}(t)),\, 0\right], \tag{8}$$

$$I(t+1) = I(t) + i(\boldsymbol{\varphi}(t)). \tag{9}$$
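The virtual queue updates (7)-(9) can be sketched as a small state container. This is an illustrative sketch only: the class name `VirtualQueues` and its method signature are hypothetical, and the values of g and i are assumed to be evaluated by the caller for the current slot.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VirtualQueues:
    """Hypothetical container for the virtual queues Y_k(t), G(t), I(t)."""
    y: List[float]
    g: float = 0.0
    i: float = 0.0

    def update(self, phi, x, g_value, i_value):
        """Apply one slot of the updates (7), (8), and (9)."""
        # Eq. (7): Y_k grows when the auxiliary variable exceeds the attribute.
        self.y = [max(yk + pk - xk, 0.0) for yk, pk, xk in zip(self.y, phi, x)]
        # Eq. (8): inequality constraint queue, projected onto non-negative values.
        self.g = max(self.g + g_value, 0.0)
        # Eq. (9): equality constraint queue, which may take either sign.
        self.i = self.i + i_value


if __name__ == "__main__":
    queues = VirtualQueues(y=[0.0, 1.0])
    queues.update(phi=[2.0, 0.5], x=[1.0, 3.0], g_value=-0.5, i_value=-0.2)
    print(queues)
```

Note that only the equality queue I(t) is left unprojected, since an equality constraint can be violated in both directions.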

2.1.3 Lyapunov optimization

Lyapunov drift-plus-penalty technique

The queue backlog vector is defined as $\boldsymbol{\Xi}(t) = [\mathbf{Q}(t), \mathbf{Y}(t), G(t), I(t)]$, which involves
constraints (2e) and (6c)-(6e) of the transformed problem (6). Hence, for given $\boldsymbol{\varphi} \in \mathcal{R}$
and $\alpha(t) \in \mathcal{A}_{\beta(t)}$, the stability of $\boldsymbol{\Xi}(t)$ ensures that all constraints of (6) hold. The
main idea of the Lyapunov optimization is to choose actions that maximise/minimise
the objective function while keeping the queues stable. Here, the Lyapunov
function is written as

$$L(\boldsymbol{\Xi}(t)) = \frac{1}{2}\left[\sum_{k=1}^{K}\left(Q_k(t)^2 + Y_k(t)^2\right) + G(t)^2 + I(t)^2\right]. \tag{10}$$

For each time slot $t$, $\Delta(\boldsymbol{\Xi}(t))$ denotes the Lyapunov drift, which is given by

$$\Delta(\boldsymbol{\Xi}(t)) = \mathbb{E}\left[L(\boldsymbol{\Xi}(t+1)) - L(\boldsymbol{\Xi}(t)) \mid \boldsymbol{\Xi}(t)\right]. \tag{11}$$

The solution of (6) is obtained by minimizing the Lyapunov drift and a penalty from the
objective function, given the existing $\boldsymbol{\Xi}(t)$ and observing $\beta(t)$ for all $t$:

$$\min \ \Delta(\boldsymbol{\Xi}(t)) - \nu\, \mathbb{E}\left[f(\boldsymbol{\varphi}) \mid \boldsymbol{\Xi}(t)\right]. \tag{12}$$

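Here, $\nu$ is a non-negative control parameter that weights the penalty term against the drift. Noting
that $\max[a,0]^2 \le a^2$ and $(a \pm b)^2 \le a^2 \pm 2ab + b^2$ for any positive real numbers $a, b$, and
neglecting the index $t$, we have:

$$
\begin{aligned}
\left(\max[Q_k - x_k, 0] + a_k\right)^2 - Q_k^2 &\le 2Q_k(a_k - x_k) + (a_k - x_k)^2, \\
\max[Y_k + \varphi_k - x_k, 0]^2 - Y_k^2 &\le 2Y_k(\varphi_k - x_k) + (\varphi_k - x_k)^2, \\
\max[G + g(\boldsymbol{\varphi}), 0]^2 - G^2 &\le 2G\,g(\boldsymbol{\varphi}) + g(\boldsymbol{\varphi})^2, \\
\left[I + i(\boldsymbol{\varphi})\right]^2 - I^2 &\le 2I\,i(\boldsymbol{\varphi}) + i(\boldsymbol{\varphi})^2.
\end{aligned}
$$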

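Now, the objective function of (12) is upper bounded as

$$
\begin{aligned}
\Delta(\boldsymbol{\Xi}(t)) - \nu\,\mathbb{E}\left[f(\boldsymbol{\varphi}) \mid \boldsymbol{\Xi}(t)\right] \le \Psi &- \nu\,\mathbb{E}\left[f(\boldsymbol{\varphi}(t)) \mid \boldsymbol{\Xi}(t)\right] + \sum_{k=1}^{K} Q_k(t)\,\mathbb{E}\left[a_k(t) - x_k(t) \mid \boldsymbol{\Xi}(t)\right] \\
&+ \sum_{k=1}^{K} Y_k(t)\,\mathbb{E}\left[\varphi_k(t) - x_k(t) \mid \boldsymbol{\Xi}(t)\right] \\
&+ G(t)\,\mathbb{E}\left[g(\boldsymbol{\varphi}(t)) \mid \boldsymbol{\Xi}(t)\right] + I(t)\,\mathbb{E}\left[i(\boldsymbol{\varphi}(t)) \mid \boldsymbol{\Xi}(t)\right],
\end{aligned} \tag{13}
$$

where $\Psi$ is a finite constant that satisfies

$$\Psi \ge \frac{1}{2}\sum_{k=1}^{K}\mathbb{E}\left[(a_k(t) - x_k(t))^2 \mid \boldsymbol{\Xi}(t)\right] + \frac{1}{2}\sum_{k=1}^{K}\mathbb{E}\left[(\varphi_k(t) - x_k(t))^2 \mid \boldsymbol{\Xi}(t)\right] + \frac{1}{2}\mathbb{E}\left[g(\boldsymbol{\varphi}(t))^2 + i(\boldsymbol{\varphi}(t))^2 \mid \boldsymbol{\Xi}(t)\right]$$

for all $t$ and all possible $\boldsymbol{\Xi}(t)$.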

Determining control variables

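Note that the solution to (6) is acquired by minimizing the right-hand side (RHS) of (13)
without the constant $\Psi$ in every slot $t$ [77]. Finally, we arrange the variables and decouple
the problem into several sub-problems and update the queues accordingly.

Sub-problem 1: Select the auxiliary variables to minimize:

$$
\begin{aligned}
\min \quad & \sum_{k=1}^{K} Y_k(t)\varphi_k(t) + G(t)\,g(\boldsymbol{\varphi}(t)) + I(t)\,i(\boldsymbol{\varphi}(t)) - \nu f(\boldsymbol{\varphi}(t)) && \text{(14a)} \\
\text{subject to} \quad & \boldsymbol{\varphi} \in \mathcal{R}. && \text{(14b)}
\end{aligned}
$$

Sub-problem 2: Choose the actions to satisfy:

$$
\begin{aligned}
\min \quad & \sum_{k=1}^{K} -\left[Q_k(t) + Y_k(t)\right] x_k(\alpha(t), \beta(t)) && \text{(15a)} \\
\text{subject to} \quad & \alpha(t) \in \mathcal{A}_{\beta(t)}, \ \forall t. && \text{(15b)}
\end{aligned}
$$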
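The per-slot decomposition above can be illustrated on a toy scheduling problem. The sketch below is not the algorithm of the later chapters; it assumes a hypothetical setting in which exactly one of K users is served per slot, the utility is f(phi) = sum_k w_k log(1 + phi_k), the constraints g and i are absent (so G(t) = I(t) = 0), and the auxiliary variables are restricted to the box [0, phi_max]. Under these assumptions, Sub-problem 1 has the closed form phi_k = nu * w_k / Y_k - 1 clipped to the box, and Sub-problem 2 reduces to picking the user with the largest (Q_k + Y_k) x_k.

```python
import random


def drift_plus_penalty(num_users=3, num_slots=5000, nu=50.0, phi_max=5.0, seed=1):
    """Toy drift-plus-penalty scheduler (assumed setting, see lead-in)."""
    rng = random.Random(seed)
    weights = [1.0] * num_users
    q = [0.0] * num_users          # actual queues Q_k(t)
    y = [0.0] * num_users          # virtual queues Y_k(t)
    served = [0.0] * num_users
    backlog_sum = 0.0
    for _ in range(num_slots):
        # Observe the random event beta(t): achievable rates and arrivals.
        rates = [rng.uniform(0.0, 4.0) for _ in range(num_users)]
        arrivals = [rng.uniform(0.0, 1.0) for _ in range(num_users)]

        # Sub-problem 1: minimize Y_k * phi - nu * w_k * log(1 + phi) on [0, phi_max].
        phi = []
        for k in range(num_users):
            if y[k] <= 0.0:
                phi.append(phi_max)
            else:
                phi.append(min(max(nu * weights[k] / y[k] - 1.0, 0.0), phi_max))

        # Sub-problem 2: serve the user maximizing (Q_k + Y_k) * x_k.
        scheduled = max(range(num_users), key=lambda k: (q[k] + y[k]) * rates[k])
        x = [rates[k] if k == scheduled else 0.0 for k in range(num_users)]

        # Queue updates, Eqs. (1) and (7).
        for k in range(num_users):
            q[k] = max(q[k] - x[k], 0.0) + arrivals[k]
            y[k] = max(y[k] + phi[k] - x[k], 0.0)
            served[k] += x[k]
        backlog_sum += sum(q)
    return [s / num_slots for s in served], backlog_sum / num_slots


if __name__ == "__main__":
    avg_rates, avg_backlog = drift_plus_penalty()
    print("average service per user:", avg_rates)
    print("average total backlog:", avg_backlog)
```

In this toy setting, a larger value of nu lets the virtual queues grow before the auxiliary variables are throttled, which mirrors the utility-backlog trade-off governed by the control parameter.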

2.2 A successive convex approximation technique

Consider the following non-convex optimization problem:

min f (x) (16a)

subject to g(x)≤ 0, (16b)

x ∈ R , (16c)

where $f(\cdot)$ is assumed to be a twice differentiable, convex function for all $\mathbf{x} \in \mathcal{R}$, and $g(\cdot)$
is a continuous non-convex function for all $\mathbf{x} \in \mathcal{R}$. We first assume that the non-convex
function has a convex upper approximation, i.e.,

$$g(\mathbf{x}) \le G(\mathbf{x}, \mathbf{y}), \tag{17}$$

where $G(\mathbf{x}, \mathbf{y})$ is a convex and continuously differentiable function for $\mathbf{x} \in \mathcal{R}$ and a fixed
parameter vector $\mathbf{y} \in \mathcal{R}$.

The main idea of the successive convex approximation technique is to replace the non-

convex function via its proper upper bound for some appropriately chosen parameter

vector y [78]. We require the convex upper bound to satisfy the following properties:


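Property 2.1. For a given $\mathbf{x} \in \mathcal{R}$, at every iteration $i$ there exists $\mathbf{y}^{(i)} := \psi(\mathbf{x}^{(i)})$ that
satisfies

$$
\begin{aligned}
g(\mathbf{x}) &\le G(\mathbf{x}, \mathbf{y}^{(i)}), && \text{(18a)} \\
g(\mathbf{x}^{(i)}) &= G(\mathbf{x}^{(i)}, \mathbf{y}^{(i)}), && \text{(18b)} \\
\nabla g(\mathbf{x}^{(i)}) &= \nabla G(\mathbf{x}^{(i)}, \mathbf{y}^{(i)}), && \text{(18c)}
\end{aligned}
$$

where $\nabla g(\cdot)$ is the gradient of $g(\cdot)$.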

In Property 2.1, (18b) and (18c) guarantee that the Karush-Kuhn-Tucker (KKT) op-

timality conditions are satisfied by the convergence points [78]. Moreover, (18a) and

(18b) ensure the feasibility of the iterates and the monotonicity of the objective function.

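At each iteration $i$, for a given starting point $\mathbf{x}_0 \in \mathcal{R}$ that is feasible for (16), by setting
$\mathbf{y}^{(i)} := \psi(\mathbf{x}_0)$, we arrive at the following convex problem:

$$
\begin{aligned}
\min \quad & f(\mathbf{x}) && \text{(19a)} \\
\text{subject to} \quad & G(\mathbf{x}, \mathbf{y}^{(i)}) \le 0, && \text{(19b)} \\
& \mathbf{x} \in \mathcal{R}. && \text{(19c)}
\end{aligned}
$$

We denote $\mathbf{x}^{\star}$ as the optimal solution of (19), which is also feasible for (16) due to the
conditions in (18). Thus, $\mathbf{x}^{\star}$ is used as the feasible point for the next iteration $i := i+1$.
We set $\mathbf{y}^{(i+1)} := \psi(\mathbf{x}^{\star})$ and $\mathbf{x}^{(i+1)} := \mathbf{x}^{\star}$, and iteratively solve (19) until the convergence
condition is achieved.

We note that $f(\mathbf{x}^{\star}) \le f(\mathbf{x}^{(i)})$ for all iterations $i$; hence, the SCA method produces
a sequence of feasible solutions whose objective values are monotonically decreasing. The
algorithm therefore converges provided that the objective is bounded below by a finite limit [78].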
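As a concrete illustration of the SCA iteration, consider a toy problem chosen here for convenience (it does not appear in the thesis): minimize the squared distance to a point c subject to the non-convex constraint g(x) = 1 - ||x||^2 <= 0. Linearizing the concave term at y gives the surrogate G(x, y) = 1 + ||y||^2 - 2 y^T x, which satisfies (18a)-(18c) with psi(x) = x because G(x, y) - g(x) = ||x - y||^2. Each convex subproblem (19) is then a projection onto a half-space and has a closed form, so no solver is needed. The function name below is hypothetical.

```python
def sca_min_distance_outside_unit_disc(c, x0, num_iters=60):
    """SCA sketch for: min ||x - c||^2  s.t.  1 - ||x||^2 <= 0 (assumed toy problem).

    Surrogate constraint at y: 1 + ||y||^2 - 2 y^T x <= 0, i.e. a^T x >= b
    with a = 2y and b = 1 + ||y||^2.
    """
    def objective(x):
        return sum((xk - ck) ** 2 for xk, ck in zip(x, c))

    x = list(x0)                       # x0 must be feasible: ||x0|| >= 1
    history = [objective(x)]
    for _ in range(num_iters):
        y = x                          # y^(i) = psi(x^(i)) = x^(i)
        a = [2.0 * yk for yk in y]
        b = 1.0 + sum(yk * yk for yk in y)
        gap = b - sum(ak * ck for ak, ck in zip(a, c))
        if gap <= 0.0:
            x = list(c)                # c already satisfies the surrogate constraint
        else:
            # Closed-form solution of the convex subproblem (19):
            # Euclidean projection of c onto the half-space a^T x >= b.
            step = gap / sum(ak * ak for ak in a)
            x = [ck + step * ak for ck, ak in zip(c, a)]
        history.append(objective(x))
    return x, history


if __name__ == "__main__":
    solution, values = sca_min_distance_outside_unit_disc((0.3, 0.4), (2.0, 0.0))
    print("solution:", solution)
    print("first objective values:", values[:4])
```

For c = (0.3, 0.4), the iterates stay feasible, the objective values decrease monotonically, and the sequence approaches the closest point on the unit circle, c / ||c|| = (0.6, 0.8).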

2.3 Random matrix theory

In this section, a brief introduction to random matrix theory is provided to deterministically
approximate high-dimensional random processes, which requires only the knowledge
of the statistical channel correlation matrices $\boldsymbol{\Theta}_m$. In the context of massive MIMO systems
serving a large number of users, the wireless channel propagation is often modeled
as a large random matrix, and thus, random matrix theory (RMT) provides a powerful
tool to characterize the network performance in diverse MIMO scenarios [41, 79].

The author starts by revisiting some important lemmas for studying large-dimensional
random matrices as follows:

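Lemma 2.1. [Matrix inversion] Let $\mathbf{H}$ be an $N \times N$ invertible matrix and $\mathbf{x} \in \mathbb{C}^{N}$, $c \in \mathbb{C}$
for which $\mathbf{H} + c\,\mathbf{x}\mathbf{x}^{\dagger}$ is invertible. We have

$$\mathbf{x}^{\dagger}\left(\mathbf{H} + c\,\mathbf{x}\mathbf{x}^{\dagger}\right)^{-1} = \frac{\mathbf{x}^{\dagger}\mathbf{H}^{-1}}{1 + c\,\mathbf{x}^{\dagger}\mathbf{H}^{-1}\mathbf{x}}. \tag{20}$$

Lemma 2.2. [Resolvent identity] Let $\mathbf{H}$ and $\mathbf{W}$ be two invertible complex matrices of
size $N \times N$. We have

$$\mathbf{H}^{-1} - \mathbf{W}^{-1} = -\mathbf{H}^{-1}\left(\mathbf{H} - \mathbf{W}\right)\mathbf{W}^{-1}. \tag{21}$$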
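Both identities are purely algebraic and can be checked numerically. The sketch below does so with NumPy on random, well-conditioned test matrices; the helper names and the way the test matrices are generated are assumptions made here for illustration.

```python
import numpy as np


def matrix_inversion_lemma_error(H, x, c):
    """Largest entry-wise gap between the two sides of Eq. (20)."""
    h_inv = np.linalg.inv(H)
    lhs = x.conj() @ np.linalg.inv(H + c * np.outer(x, x.conj()))
    rhs = (x.conj() @ h_inv) / (1.0 + c * (x.conj() @ h_inv @ x))
    return float(np.max(np.abs(lhs - rhs)))


def resolvent_identity_error(H, W):
    """Largest entry-wise gap between the two sides of Eq. (21)."""
    h_inv = np.linalg.inv(H)
    w_inv = np.linalg.inv(W)
    return float(np.max(np.abs((h_inv - w_inv) + h_inv @ (H - W) @ w_inv)))


def random_invertible(n, rng):
    """Hypothetical test matrix: complex Gaussian plus a dominant diagonal."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return g + 2.0 * n * np.eye(n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H, W = random_invertible(6, rng), random_invertible(6, rng)
    x = rng.standard_normal(6) + 1j * rng.standard_normal(6)
    print(matrix_inversion_lemma_error(H, x, 0.7 + 0.2j))
    print(resolvent_identity_error(H, W))
```

Both reported gaps should be at the level of floating-point round-off.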

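Lemma 2.3. Let $\mathbf{A}_1, \mathbf{A}_2, \cdots$, with $\mathbf{A}_N \in \mathbb{C}^{N \times N}$, be a series of random matrices generated
by the probability space $(\Omega, \mathcal{F}, P)$ such that, for $w \in A \subset \Omega$, with $P(A) = 1$,
$\|\mathbf{A}_N(w)\| < K(w)$, uniformly on $N$. Let $\mathbf{x}_1, \mathbf{x}_2, \cdots$, with $\mathbf{x}_N \in \mathbb{C}^{N}$, be random vectors
of i.i.d. entries with zero mean, variance $1/N$, and eighth-order moment of order
$O(1/N^4)$, independent of $\mathbf{A}_N$. Then

$$\mathbf{x}_N^{\dagger}\mathbf{A}_N\mathbf{x}_N - \frac{1}{N}\mathrm{Tr}\,\mathbf{A}_N \xrightarrow{N \to \infty} 0 \tag{22}$$

almost surely.

Lemma 2.4. Let $\mathbf{A}_N$ be as in Lemma 2.3 and let $\mathbf{x}_N, \mathbf{y}_N \in \mathbb{C}^{N}$ be random, mutually
independent with standard i.i.d. entries of zero mean, variance $1/N$, and
eighth-order moment of order $O(1/N^4)$, independent of $\mathbf{A}_N$. Then

$$\mathbf{y}_N^{\dagger}\mathbf{A}_N\mathbf{x}_N \xrightarrow{N \to \infty} 0 \tag{23}$$

almost surely.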
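The two convergence statements in Lemmas 2.3 and 2.4 can be observed numerically. In the sketch below, the matrix A_N is drawn as a scaled Gram matrix of a complex Gaussian matrix, which is an assumption made here to obtain a spectral norm that stays bounded as N grows; the function names are hypothetical.

```python
import numpy as np


def complex_gaussian(shape, variance, rng):
    """Circularly symmetric complex Gaussian samples with the given variance."""
    scale = np.sqrt(variance / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))


def trace_lemma_gaps(N, rng):
    """Return |x^H A x - Tr(A)/N| (Lemma 2.3) and |y^H A x| (Lemma 2.4)."""
    g = complex_gaussian((N, N), 1.0, rng)
    A = g @ g.conj().T / N                       # assumed test matrix, independent of x, y
    x = complex_gaussian(N, 1.0 / N, rng)        # i.i.d. entries, zero mean, variance 1/N
    y = complex_gaussian(N, 1.0 / N, rng)
    gap_quadratic = abs(x.conj() @ A @ x - np.trace(A) / N)
    gap_cross = abs(y.conj() @ A @ x)
    return float(gap_quadratic), float(gap_cross)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for N in (50, 200, 800):
        print(N, trace_lemma_gaps(N, rng))
```

Both gaps are random for finite N, but they shrink as N grows, which is what allows quadratic forms in SINR expressions to be replaced by normalized traces in the large-system analysis.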

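Lemma 2.5. Let $\mathbf{A}_1, \mathbf{A}_2, \cdots$, with $\mathbf{A}_N \in \mathbb{C}^{N \times N}$, be deterministic with a uniformly bounded
spectral norm and let $\mathbf{B}_1, \mathbf{B}_2, \cdots$, with $\mathbf{B}_N \in \mathbb{C}^{N \times N}$, be random Hermitian, with eigenvalues
$\lambda_1^{\mathbf{B}_N} \le \cdots \le \lambda_N^{\mathbf{B}_N}$ such that, with probability 1, there exists $\varepsilon > 0$ for which $\lambda_1^{\mathbf{B}_N} > \varepsilon$
for all large $N$. Then, for $\mathbf{v} \in \mathbb{C}^{N}$,

$$\frac{1}{N}\mathrm{Tr}\,\mathbf{A}_N\mathbf{B}_N^{-1} - \frac{1}{N}\mathrm{Tr}\,\mathbf{A}_N\left(\mathbf{B}_N + \mathbf{v}\mathbf{v}^{\dagger}\right)^{-1} \xrightarrow{N \to \infty} 0 \tag{24}$$

almost surely, where $\mathbf{B}_N^{-1}$ and $\left(\mathbf{B}_N + \mathbf{v}\mathbf{v}^{\dagger}\right)^{-1}$ exist with probability 1.

Next, Theorem 2.1 is introduced to deterministically approximate the random matrices,
resulting in a closed-form expression.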

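Theorem 2.1. [A deterministic approximation of random matrix]
Let $\mathbf{B}_N = \mathbf{X}_N^{\dagger}\mathbf{X}_N + \mathbf{S}_N$ with $\mathbf{S}_N \in \mathbb{C}^{N \times N}$ Hermitian nonnegative definite and $\mathbf{X}_N \in \mathbb{C}^{n \times N}$
random. The $i$th column $\mathbf{x}_i$ of $\mathbf{X}_N^{H}$ is $\mathbf{x}_i = \boldsymbol{\Psi}_i\mathbf{y}_i$, where the entries of $\mathbf{y}_i \in \mathbb{C}^{r_i}$ are i.i.d. with
zero mean and variance $1/N$. The matrices $\boldsymbol{\Psi}_i \in \mathbb{C}^{N \times r_i}$ are deterministic. Furthermore,
let $\boldsymbol{\Theta}_i = \boldsymbol{\Psi}_i\boldsymbol{\Psi}_i^{H} \in \mathbb{C}^{N \times N}$ and define $\mathbf{Q}_N \in \mathbb{C}^{N \times N}$ deterministic. Assume
$\limsup_{N \to \infty} \sup_{1 \le i \le n} \|\boldsymbol{\Theta}_i\| < \infty$ and let $\mathbf{Q}_N$ have uniformly bounded spectral norm (with respect to $N$). We define the
random matrix identity, which is approximated later, as follows:

$$m_{\mathbf{B}_N,\mathbf{Q}_N}(z) \triangleq \frac{1}{N}\mathrm{Tr}\,\mathbf{Q}_N\left(\mathbf{B}_N - z\mathbf{I}_N\right)^{-1}. \tag{25}$$

Under the assumptions that, for $z \in \mathbb{C} \setminus \mathbb{R}_{+}$, $n, N$ grow large with ratios $\beta_{N,i} \triangleq N/r_i$
and $\beta_N \triangleq N/n$ such that $0 < \liminf_N \beta_{N,i} \le \limsup_N \beta_{N,i} < \infty$ and
$0 < \liminf_N \beta_N \le \limsup_N \beta_N < \infty$, we get the closed-form expression of (25) as

$$m_{\mathbf{B}_N,\mathbf{Q}_N}(z) - m^{\circ}_{\mathbf{B}_N,\mathbf{Q}_N}(z) \xrightarrow{N \to \infty} 0, \tag{26}$$

almost surely, with $m^{\circ}_{\mathbf{B}_N,\mathbf{Q}_N}(z)$ given by

$$m^{\circ}_{\mathbf{B}_N,\mathbf{Q}_N}(z) = \frac{1}{N}\mathrm{Tr}\,\mathbf{Q}_N\left(\frac{1}{N}\sum_{j=1}^{n}\frac{\boldsymbol{\Theta}_j}{1 + e_{N,j}(z)} + \mathbf{S}_N - z\mathbf{I}_N\right)^{-1}, \tag{27}$$

where the functions $e_{N,1}(z), \ldots, e_{N,n}(z)$ form the unique solution of

$$e_{N,i}(z) = \frac{1}{N}\mathrm{Tr}\,\boldsymbol{\Theta}_i\left(\frac{1}{N}\sum_{j=1}^{n}\frac{\boldsymbol{\Theta}_j}{1 + e_{N,j}(z)} + \mathbf{S}_N - z\mathbf{I}_N\right)^{-1}, \tag{28}$$

which is the Stieltjes transformation of a nonnegative finite measure on $\mathbb{R}_{+}$. Moreover,
for $z < 0$, the scalars $e_{N,1}(z), \ldots, e_{N,n}(z)$ are the unique nonnegative solutions to (28).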
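A minimal numerical sketch of Theorem 2.1 is given below for a simplified special case chosen here to keep the code short: all correlation matrices are diagonal (so only their diagonals are stored), S_N = 0, Q_N = I_N, and z is real and negative. The fixed point (28) is computed by plain fixed-point iteration, which is an assumption of this sketch rather than a method prescribed by the text, and the result (27) is compared against one Monte Carlo realization of (25). The function names are hypothetical.

```python
import numpy as np


def deterministic_equivalent(theta_diag, z, max_iter=1000, tol=1e-12):
    """Iterate Eq. (28) and evaluate Eq. (27) for diagonal Theta_j, S_N = 0, Q_N = I.

    theta_diag has shape (n, N); row j holds the diagonal of Theta_j. z must be < 0.
    """
    n, N = theta_diag.shape
    e = np.zeros(n)
    for _ in range(max_iter):
        # Diagonal of ((1/N) * sum_j Theta_j / (1 + e_j) - z I)^{-1}.
        t_diag = 1.0 / ((theta_diag / (1.0 + e)[:, None]).sum(axis=0) / N - z)
        e_next = theta_diag @ t_diag / N          # (1/N) Tr(Theta_i T)
        converged = np.max(np.abs(e_next - e)) < tol
        e = e_next
        if converged:
            break
    t_diag = 1.0 / ((theta_diag / (1.0 + e)[:, None]).sum(axis=0) / N - z)
    return float(np.mean(t_diag)), e


def empirical_trace(theta_diag, z, rng):
    """One realization of Eq. (25) with Q_N = I and S_N = 0."""
    n, N = theta_diag.shape
    y = (rng.standard_normal((n, N)) + 1j * rng.standard_normal((n, N))) / np.sqrt(2.0 * N)
    x_h = (np.sqrt(theta_diag) * y).T             # columns x_i = Theta_i^{1/2} y_i
    B = x_h @ x_h.conj().T                        # B_N = sum_i x_i x_i^H
    return float(np.trace(np.linalg.inv(B - z * np.eye(N))).real / N)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = rng.uniform(0.5, 1.5, size=(200, 200))
    m_det, _ = deterministic_equivalent(theta, -1.0)
    print("deterministic equivalent:", m_det)
    print("single random realization:", empirical_trace(theta, -1.0, rng))
```

The deterministic value depends only on the correlation matrices, which is what makes closed-form expressions for rate and transmit power possible without averaging over channel realizations.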

2.4 Reinforcement learning

Reinforcement learning is an area of machine learning in which agents perform actions

to interact with the environment so as to maximize the cumulative reward [80]. By

evaluating feedback from their own actions and experiences, the agents determine a

sequence of best actions which maximize the long-term reward.

Basically, reinforcement learning is concerned with decision making to enable the

adaptation and self-organization, and the agents spend time discovering actions to find

the best strategies, then exploit them in the long run. At each time slot t, each agent
selects an action from a possible action set, observes the environment, and
experiences the reward as shown in Fig. 4. In the next time slot t+1, the agent evaluates
the decision made in the previous time slot and selects the next action
based on the distribution of the action-reward. Here, the concept of regret strategy
is employed, defined as the difference between the average utility when choosing the
same actions in previous times, and its average utility obtained by constantly selecting
different actions. The premise is that regret should be minimized over time so as to
choose the best sequence of actions.

Fig. 4. Reinforcement learning model ([73] ©2018 IEEE).
[Figure: the agent applies Action(t) to the environment and receives Reward(t), an observation, and New State(t+1) as feedback.]

The important elements of reinforcement learning include agents, actions, reward

function, policy and environment, which are briefly described as follows:

– Agents can be network operators, base stations, or users, who want to maximize their

cumulative reward functions.

– Actions are defined as a set of things that agents do to solve their concerns with the

environments. In the context of resource allocation, actions could consist of user

association, power assignment, or beamwidth selection.

– Reward function is defined as the cumulative return for the agent after applying

selected actions to the environment. Network utility function and power consumption

are common metrics used to measure the reward.


– Policy refers to strategies that the agents play to determine next action based on the

distribution of actions-rewards. It is a mapping between action and state. Here, a

state is the current condition of the environment such as the channel state, or network

queuing state.

– The environment contains the network system, where the agents play their actions

to maximize the reward. At the beginning of each time slot, the agents observe the

reward, which reflects the noise and interference in the environment.
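The interaction loop between agent, actions, reward, and policy can be sketched with a simple multi-armed bandit. The example below uses epsilon-greedy action selection, which is a standard technique swapped in here for brevity and is not the regret-based or risk-sensitive learning used later in the thesis. The reward model (Gaussian noise around a fixed mean per action), the regret measure (best mean reward minus achieved average reward), and all names are assumptions of this sketch.

```python
import random


def epsilon_greedy(mean_rewards, num_slots=3000, epsilon=0.1, noise_std=0.1, seed=0):
    """Single-agent epsilon-greedy learning over a finite action set."""
    rng = random.Random(seed)
    num_actions = len(mean_rewards)
    counts = [0] * num_actions
    estimates = [0.0] * num_actions      # running average reward per action
    total_reward = 0.0
    for _ in range(num_slots):
        # Policy: explore with probability epsilon, otherwise exploit.
        if rng.random() < epsilon:
            action = rng.randrange(num_actions)
        else:
            action = max(range(num_actions), key=lambda a: estimates[a])
        # Environment: noisy reward for the selected action (assumed model).
        reward = rng.gauss(mean_rewards[action], noise_std)
        # Feedback: update the action-reward statistics incrementally.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    regret = max(mean_rewards) - total_reward / num_slots
    return estimates, counts, regret


if __name__ == "__main__":
    est, cnt, reg = epsilon_greedy([0.2, 0.5, 0.8])
    print("estimated rewards:", est)
    print("selection counts:", cnt)
    print("average regret:", reg)
```

In a resource allocation setting, the actions would correspond to choices such as power levels or beamwidths, and the reward to a network utility observed after each slot.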


3 Integrated access and backhaul architecture

Starting from this chapter, each research question is answered sequentially. In particular, this

chapter addresses the first question Q1 by proposing an integrated access-backhaul

(IAB) framework to dynamically schedule users, while efficiently providing a wireless

backhaul to dense small cells and mitigating interference. In addition, joint resource al-

location and interference mitigation solutions are proposed for two-hop self-backhauled

networks.

3.1 Main contributions and related work

The main contributions are listed as follows:

– The problem of joint load balancing (user association and user scheduling) and inter-

ference management (beamforming design and power allocation) for 5G HetNets is

modelled in which a DL scheduler is designed at the MBS to schedule macro UEs and

provide a backhaul to FD-enabled SCs, while the FD-capable SCs serve both MUEs and

small cell UEs in the same frequency band. Moreover, an interference management

scheme is proposed to mitigate both co-tier and cross-tier interference from the MBS

and FD-enabled SCs by designing a hierarchical precoding scheme and controlling

the transmission of the SCs. The problem is cast as a network utility maximization

(NUM) problem subject to dynamic wireless backhaul constraints, traffic load, and

imperfect channel state information (CSI). To make the problem tractable, by invok-

ing results from random matrix theory (RMT), we derive a closed-form expression

of the signal-to-interference-plus-noise-ratio (SINR) and transmit power when the

numbers of MBS antennas and users grow very large.

– A Lyapunov framework is applied to solve the NUM problem in polynomial time.

The NUM problem is decomposed into the dynamic scheduling of MUEs, as well as

the backhaul provisioning of FD-enabled SCs, and offloading MUEs to FD-enabled

SCs. The joint load balancing and operation mode (FD or half-duplex) subproblem,

which is a non-convex program with binary variables, is converted into a convex

program by using the successive convex approximation (SCA) method. The motivations
for using SCA are its low complexity and fast convergence, and the fact that many
relaxed variables in the obtained solution are close to zero or one.


– A performance evaluation is carried out to compare the proposed algorithm with other

baselines under the impact of SC density and transmit power levels at low/high fre-

quency bands. A comprehensive performance analysis of our proposed algorithm

based on the Lyapunov framework is provided in Appendix 1. There exists an
[O(1/ν), O(ν)] utility-queue backlog tradeoff, which leads to a utility-latency
balance [72], where ν is the Lyapunov control parameter. Moreover, a convergence

analysis of the approximation method based on the SCA method is studied.

Related work

An overview of cellular backhaul technologies, together with the associated design issues and challenges, was
provided in [54]. Recent work on mmWave access and backhauling for 5G communication
systems is discussed in [52]. The Xhaul architecture presented in [52] aims

to develop a 5G integrated backhaul and fronthaul transport network enabling flexible

and software-defined reconfiguration of all networking elements in a multi-tenant and

service-oriented unified management environment. As pointed out in [67, 68], the cur-

rent solutions for user association problems ignore the backhaul constraints, which are

crucial since the capacity of open access SCs with either wired or wireless backhaul
is always limited by the backhaul constraint.

Moreover, the load balancing problem should take imperfect CSI into account due

to mobility, which is ignored in the previous work. Our previous work in [74] consid-

ered the problem of joint in-band scheduling and interference mitigation in 5G HetNets

without considering the user association. In this chapter, we extend [74] by considering

the load balancing problem taking into account the backhaul constraint and imperfect

CSI, and further this chapter provides insights into the performance analysis of our

proposed algorithm based on the Lyapunov framework and convergence of the SCA

method.

The rest of this chapter is organized as follows. Section 3.2 describes the system
model and Section 3.3 provides the problem formulation for load balancing and interference
mitigation. Section 3.4 introduces the Lyapunov framework used to solve our
problem. In Section 3.5, we present the numerical results. We conclude the chapter in

Section 3.6.


3.2 System model

The downlink (DL) transmission of a HetNet scenario is considered as shown in Fig. 1

in which a MBS b0 is underlaid with a set of uniformly deployed S FD-enabled SCs,

S = bs|s ∈ 1, . . . ,S. Let B = b0∪ S denote the set of all base stations, where

|B|= 1+S. The MBS is equipped with N number of antennas and serves a set of single-

antenna M MUEs M = 1, . . . ,M. Let K = M ∪ S denote the set of users associated

with MBS b0, where |K | = K = M+ S. The user indices k = 1,2, ...,M represent the

corresponding MUE indices m= 1,2, ...,M, while the user indices k = M+1,M+2, ...,

M+S represent the corresponding SC indices s = 1,2, ...,S. We assume an open access

policy at the FD-enabled SCs and each FD-enabled SC is assumed to be equipped with

Ns + 1 antennas: one receiving antenna is used for the wireless backhaul and Ns trans-

mitting antennas to serve its single-antenna small cell UEs (SUEs) or other MUEs at the

same frequency band. Let C = c1,c2, . . . ,cS denote the set of SUEs, where |C | = S.

Moreover, the SCs are assumed to be FD capable with perfect self-interference can-

celation (SIC) capabilities [31, 33, 81]. A co-channel time-division duplexing (TDD)

protocol is considered in which the MBS and FD-enabled SCs share the entire band-

width, and the DL transmission occurs at the same time. In this work, we consider

a large number of antennas at both the macro and SC BSs and a dense deployment

of MUEs and SCs, such that $M, N, N_s, S \gg 1$. We denote $\mathbf{h}_m^{(b_0)} = [h_m^{(b_0,1)}, h_m^{(b_0,2)}, \dots, h_m^{(b_0,N)}]^T \in \mathbb{C}^{N\times 1}$ as the propagation channel between the $m$th MUE and the antennas of the MBS $b_0$, in which $h_m^{(b_0,n)}$ is the channel between the $m$th MUE and the $n$th MBS antenna. Let $\mathbf{H}^{(b_0),\mathcal{M}} = [\mathbf{h}_1^{(b_0)}, \mathbf{h}_2^{(b_0)}, \dots, \mathbf{h}_M^{(b_0)}] \in \mathbb{C}^{N\times M}$ denote the channel matrix between all MUEs and the MBS antennas. Moreover, we assume imperfect CSI for the MUEs due to mobility, and we denote $\hat{\mathbf{H}}^{(b_0),\mathcal{M}} = [\hat{\mathbf{h}}_1^{(b_0)}, \hat{\mathbf{h}}_2^{(b_0)}, \dots, \hat{\mathbf{h}}_M^{(b_0)}] \in \mathbb{C}^{N\times M}$ as the estimate of $\mathbf{H}^{(b_0),\mathcal{M}}$, in which the imperfect CSI can be modeled as [16]:

\[
\hat{\mathbf{h}}_m^{(b_0)} = \sqrt{N}\, \boldsymbol{\Theta}_m^{(b_0)} \hat{\mathbf{w}}_m^{(b_0)}, \tag{29}
\]

where $\hat{\mathbf{w}}_m^{(b_0)} = \sqrt{1-\tau_m^2}\, \mathbf{w}_m^{(b_0)} + \tau_m \mathbf{z}_m^{(b_0)}$ is the estimate of the small-scale fading channel matrix and $\boldsymbol{\Theta}_m^{(b_0)}$ is the spatial channel correlation matrix that accounts for the path loss and shadow fading. Note that, due to limited spatial scattering in the MIMO channel, the rank of the correlation matrix is much smaller than the number of antennas, i.e., $\operatorname{rank}(\boldsymbol{\Theta}_m^{(b_0)}) \ll N$. Moreover, the spatial channel model is clustered, with correlation matrices that belong to a finite set of finite size [25]. Here, $\mathbf{w}_m^{(b_0)}$ and $\mathbf{z}_m^{(b_0)}$ are the real channel and the channel noise, respectively, each modelled as a Gaussian random matrix with zero mean and variance $1/N$. The channel estimation error of MUE $m$ is denoted by $\tau_m \in [0,1]$; in the case of perfect CSI, $\tau_m = 0$.

Similarly, let $\mathbf{H}^{(b_0),\mathcal{S}} \in \mathbb{C}^{N\times S}$ and $\mathbf{H}^{(b_0),\mathcal{C}} \in \mathbb{C}^{N\times S}$ denote the channel matrices from the MBS antennas to the SCs and SUEs, respectively. Let $\mathbf{h}_u^{(b_s)} \in \mathbb{C}^{N_s\times 1}$ denote the channel propagation from SC $b_s$ to any receiver $u$. Let $c_s$ denote the SUE served by the SC $b_s$.
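The imperfect-CSI model in (29) can be illustrated with a short NumPy sketch. This is only an illustration under stated assumptions: the function name `imperfect_channel` is hypothetical, the correlation matrix is passed in as a plain array, and the complex Gaussian entries are drawn with total variance 1/N as described in the text.

```python
import numpy as np

def imperfect_channel(theta, tau, rng):
    """Draw a true channel and its estimate following the structure of (29).

    theta : (N, N) spatial correlation matrix (assumed given).
    tau   : channel estimation error in [0, 1]; tau = 0 means perfect CSI.
    rng   : numpy random Generator.
    """
    n = theta.shape[0]
    # Real channel w and channel noise z: complex Gaussian, zero mean, variance 1/N.
    scale = np.sqrt(1.0 / (2.0 * n))
    w = scale * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    z = scale * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    # Estimate of the small-scale fading: mix of the real channel and noise.
    w_hat = np.sqrt(1.0 - tau**2) * w + tau * z
    h_true = np.sqrt(n) * theta @ w
    h_hat = np.sqrt(n) * theta @ w_hat
    return h_true, h_hat
```

With `tau = 0` the estimate coincides with the true channel, and with `tau = 1` it is independent of it, which matches the two extremes described for the estimation error.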

3.3 Load balancing and interference mitigation

In this section, we formulate the joint optimization of user association, user scheduling, beamforming design, and power allocation. To that end, we first derive the received signal, data rate, and transmit power for each receiver (SCs are also treated as UEs of the macro BS). We then formulate the problem as a network utility maximization (NUM) subject to wireless backhaul constraints. However, the formulated problem does not have closed-form expressions for the objective and constraints. Hence, we apply RMT [41] to obtain these closed-form expressions. Finally, we use stochastic optimization to decouple our problem into several solvable sub-problems.

The problem of user scheduling and user association for load balancing in the DL is

addressed in which the MBS simultaneously provides data transmission to MUEs and

wireless backhaul to the FD-enabled SCs, while the SCs with an FD capability serve

both SUEs and MUEs. For each MUE $m \in \mathcal{M}$, let the binary variable $l_m^{(b_s)}$ indicate the transmission association from BS $b_s \in \mathcal{B}$ to MUE $m$, i.e., $l_m^{(b_s)} = 1$ when MUE $m$ is associated with BS $b_s$, and $l_m^{(b_s)} = 0$ otherwise. Similarly, let the binary variables $l_{s+M}^{(b_0)}$ and $l_{c_s}^{(b_s)}$ denote the transmission association indicators from MBS $b_0$ to SC $s$ and from SC $b_s$ to SUE $c_s$, respectively. We assume that each MUE $m$ connects to one BS (either MBS $b_0$ or SC $b_s$) at time slot $t$. Each SC is equipped with $N_s$ transmitting antennas, and we assume that each SC serves up to $N_s^{\mathrm{au}}$ active users (either SUE or MUE) at each time slot, such that $N_s^{\mathrm{au}} \le N_s$, where the superscript au stands for "active users". Hence, we have the following constraints for load balancing:

\[
\sum_{s=0}^{S} l_m^{(b_s)} \le 1, \qquad \sum_{m=1}^{M} l_m^{(b_s)} + l_{c_s}^{(b_s)} \le N_s^{\mathrm{au}}, \qquad \forall\, s, m \in \mathcal{K}. \tag{30}
\]

We define the vector $\mathbf{l} = \{\, l_j^{(b_s)} \mid b_s \in \mathcal{B},\ j \in \mathcal{M} \cup \mathcal{S} \cup \mathcal{C} \,\}$ containing all transmission indicators between BSs and UEs. Let $N_s^{\mathrm{tx}} = \sum_{m=1}^{M} l_m^{(b_s)} + l_{c_s}^{(b_s)}$ be the total number of transmissions at SC $b_s$, where the superscript tx stands for "transmissions"; the second constraint of (30) then becomes $N_s^{\mathrm{tx}} \le N_s^{\mathrm{au}},\ \forall s \in \mathcal{S}$.


3.3.1 Downlink transmission

The MBS serves two types of users: MUEs with imperfect CSI and FD-enabled SCs with perfect CSI. Let $p_m^{(b_0)}$, $p_{s+M}^{(b_0)}$, and $P^{(b_0)}$ denote the DL MBS transmit power assigned to MUE $m$, the DL MBS transmit power assigned to SC $s$, and the maximum transmit power at the MBS, respectively. We focus on the multiple-input single-output (MISO) channel, where the MBS with $N$ antennas can serve $K$ UEs. Because we take user scheduling and association into account, our proposal also applies to the special case in which the number of UEs is larger than the number of antennas, i.e., $K > N$. Although an SC exploits its FD capability to double capacity, an FD-enabled SC causes unwanted FD interference: cross-tier interference to the adjacent MUEs (or other SCs) and co-tier interference to other UEs. Hence, in order to convert the interference channel into a MISO channel, we design a precoder at the MBS and propose an operation mode policy that controls the FD interference so that the total FD interference can be treated as additional noise.

Definition 3.1. [Operation Mode Policy] We define $\boldsymbol{\phi}$ as the operation mode that controls the FD-enabled SC transmission to reduce FD interference. The operation mode is expressed as $\boldsymbol{\phi}(t) = \{\phi^{(b_s)}(t) \mid \phi^{(b_s)}(t) \in \{0,1\},\ \forall s \in \mathcal{S}\}$. Here, $\phi^{(b_s)}(t) = 1$ indicates that SC $b_s$ operates in FD mode, and $\phi^{(b_s)}(t) = 0$ indicates half-duplex (HD) mode.

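We assume that the MBS uses a precoding scheme $\mathbf{V} = [\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_K] \in \mathbb{C}^{N\times K}$. To exploit the degrees of freedom of massive MIMO, the hierarchical interference mitigation scheme in [25, 82] is applied to design the precoder, i.e., $\mathbf{V} = \mathbf{U}\mathbf{T}$, where $\mathbf{T} \in \mathbb{C}^{N_{\mathrm{itf}}\times K}$ is used to control the co-tier interference and capture the spatial multiplexing gain, and $\mathbf{U} \in \mathbb{C}^{N\times N_{\mathrm{itf}}}$ is used to suppress cross-tier interference. Here, $N_{\mathrm{itf}} < N$, where the subscript itf stands for "interference". The precoder $\mathbf{U}$ is chosen such that

\[
\mathbf{U}^{\dagger} \sum_{s=1}^{S} \phi^{(b_s)} \boldsymbol{\Theta}_s^{(b_0)} = \mathbf{0}, \tag{31}
\]

where $\boldsymbol{\Theta}_s^{(b_0)} \in \mathbb{C}^{N\times N}$ is the sum of the correlation matrices between the MBS antennas and the users belonging to SC $s$. Here, $\mathbf{U}$ lies in the null space of $\sum_{s=1}^{S} \phi^{(b_s)} \boldsymbol{\Theta}_s^{(b_0)}$. Note that $\phi^{(b_s)}$ determines whether the transmission of the FD-enabled SC is enabled. The precoder $\mathbf{T}$ is designed to adapt to the real-time CSI based on $\mathbf{H}^{\dagger}\mathbf{U} \in \mathbb{C}^{K\times N_{\mathrm{itf}}}$, where $\mathbf{H} = [\mathbf{h}_k^{(b_0)}]^{\dagger}_{k\in\mathcal{K}}$. In this chapter, we consider the regularized zero-forcing (RZF) precoding¹ given by $\mathbf{T} = \big(\mathbf{U}^{\dagger}\mathbf{H}^{\dagger}\mathbf{H}\mathbf{U} + N\zeta \mathbf{I}_{N_{\mathrm{itf}}}\big)^{-1}\mathbf{U}^{\dagger}\mathbf{H}^{\dagger}$, where the regularization parameter $\zeta > 0$ is scaled by $N$ to ensure that the matrix $\mathbf{U}^{\dagger}\mathbf{H}^{\dagger}\mathbf{H}\mathbf{U} + N\zeta \mathbf{I}_{N_{\mathrm{itf}}}$ is well conditioned as $N \to \infty$. The precoder $\mathbf{T}$ is chosen to satisfy the power constraint $\mathrm{Tr}\big(\mathbf{P}\mathbf{T}^{\dagger}\mathbf{T}\big) \le P^{(b_0)}$, where $\mathbf{P} = \mathrm{diag}(p_1^{(b_0)}, p_2^{(b_0)}, \dots, p_K^{(b_0)})$.

We also assume that each SC uses ZF precoding to serve its users, $\mathbf{F}^{(b_s)} = [\mathbf{f}_1^{(b_s)}, \mathbf{f}_2^{(b_s)}, \dots, \mathbf{f}_{N_s^{\mathrm{tx}}}^{(b_s)}] \in \mathbb{C}^{N_s\times N_s^{\mathrm{tx}}}$, which reads $\mathbf{f}_u^{(b_s)} = \mathbf{h}_u^{(b_s)\dagger}\big(\mathbf{h}_u^{(b_s)}\mathbf{h}_u^{(b_s)\dagger}\big)^{-1}$, such that $\mathbf{F}^{(b_s)}$ is chosen to satisfy the equality power constraint $\mathrm{Tr}\big(\mathbf{P}^{(b_s)}\mathbf{F}^{(b_s)\dagger}\mathbf{F}^{(b_s)}\big) = P^{(b_s)}$.² Here, $\mathbf{P}^{(b_s)} = \mathrm{diag}(p_1^{(b_s)}, p_2^{(b_s)}, \dots, p_{N_s^{\mathrm{tx}}}^{(b_s)})$. The channel propagation from the SC $b_s$ to the MUE $m$ (referred to as user $u$) is $\mathbf{h}_u^{(b_s)} = \mathbf{h}_m^{(b_s)} = \sqrt{N_s}\,\boldsymbol{\Theta}_m^{(b_s)}\big(\sqrt{1-\tau_m^2}\,\mathbf{w}_m^{(b_s)} + \tau_m \mathbf{z}_m^{(b_s)}\big)$, where $\boldsymbol{\Theta}_m^{(b_s)} \in \mathbb{C}^{N_s\times N_s}$ is the channel correlation matrix. Here, $\mathbf{w}_m^{(b_s)}$ and $\mathbf{z}_m^{(b_s)}$ are the real channel and the channel noise from SC $b_s$ to MUE $m$, respectively, each modelled as a Gaussian random matrix with zero mean and variance $1/N_s$.

¹ Other (hybrid) precoders are left for future work; the analogue beamforming design is not introduced in this chapter, and we assume that the analogue beamforming gain is normalized to one.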
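The two-stage precoder V = UT can be sketched numerically as follows. This is a hedged illustration rather than the thesis implementation: the function name `hierarchical_precoder` and the tolerance are hypothetical, H is assumed to be stored as a K x N array whose rows are the conjugate-transposed user channels (so that every product in the RZF expression is dimensionally consistent), U is taken from the eigenvectors of the Hermitian interference correlation sum with near-zero eigenvalues, and the power normalization of T is omitted.

```python
import numpy as np

def hierarchical_precoder(H, theta_sum, zeta, tol=1e-9):
    """Sketch of V = U T with U in the null space required by (31).

    H         : (K, N) channel matrix, rows are conjugate-transposed channels (assumption).
    theta_sum : (N, N) Hermitian PSD matrix, sum of phi-weighted correlation matrices.
    zeta      : RZF regularization parameter (> 0).
    """
    n = H.shape[1]
    # Null space of a Hermitian matrix: eigenvectors whose eigenvalues are (near) zero.
    eigval, eigvec = np.linalg.eigh(theta_sum)
    U = eigvec[:, eigval < tol * max(eigval.max(), 1.0)]      # shape (N, N_itf)
    HU = H @ U                                                # effective channel, (K, N_itf)
    n_itf = U.shape[1]
    # RZF on the effective channel: (U^H H^H H U + N zeta I)^{-1} U^H H^H.
    T = np.linalg.solve(HU.conj().T @ HU + n * zeta * np.eye(n_itf), HU.conj().T)
    return U, T, U @ T
```

Because U is built from null-space eigenvectors, the product of its conjugate transpose with `theta_sum` vanishes up to numerical precision, which is the property that (31) asks for.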

With a massive number of antennas at the MBS, a large number of spatial degrees of freedom is used to serve the MUEs and FD-enabled SCs, while the remaining degrees of freedom are used to mitigate cross-tier interference. In a massive MIMO system, the total number of antennas is considered as the degrees of freedom [25]. Hence, we have the antenna constraint for user association and operation mode, $\sum_{k=1}^{K} l_k^{(b_0)}(t) + \sum_{s=1}^{S} N_s^{\mathrm{tx}}(t) \le N$. For notational simplicity, we remove the time dependency from the symbols throughout the discussion. The received signal $y_m^{(b_0)}$ at each MUE $m \in \mathcal{M}$ at time instant $t$ is given by

\[
\begin{aligned}
y_m^{(b_0)} ={}& l_m^{(b_0)} \sqrt{p_m^{(b_0)}}\, \mathbf{h}_m^{(b_0)\dagger} \mathbf{v}_m x_m^{(b_0)}
+ \underbrace{\sum_{s=1}^{S} \phi^{(b_s)} \sum_{u=1}^{N_s^{\mathrm{tx}}} l_u^{(b_s)} \sqrt{p_u^{(b_s)}}\, \mathbf{h}_m^{(b_s)\dagger} \mathbf{f}_u^{(b_s)} x_u^{(b_s)}}_{\text{cross-tier interference}} \\
&+ \underbrace{\sum_{k=1,\,k\neq m}^{K} l_k^{(b_0)} \sqrt{p_k^{(b_0)}}\, \mathbf{h}_m^{(b_0)\dagger} \mathbf{v}_k x_k^{(b_0)}}_{\text{co-tier interference}} + \eta_m,
\end{aligned}
\tag{32}
\]

where $x_m^{(b_0)}$ is the signal symbol from the MBS to the MUE $m$, $\mathbf{v}_m$ is the precoding vector of MUE $m$, $\eta_m \sim \mathcal{CN}(0,\sigma^2)$ is the thermal noise at MUE $m$, and $x_u^{(b_s)}$ is the transmit signal symbol from SC $b_s$ to its user $u$.

² We chose the equality constraint for the transmit power at the SCs, rather than $\mathrm{Tr}\big(\mathbf{P}^{(b_s)}\mathbf{F}^{(b_s)\dagger}\mathbf{F}^{(b_s)}\big) \le P^{(b_s)}$, to reach the optimal rate at maximum power, since the power at the SCs is relatively small.


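At time instant $t$, the received signal $y_{s+M}^{(b_0)}$ at each SC $s \in \mathcal{K}$ suffers from self-interference, as well as cross-tier and co-tier interference, and is given by

\[
\begin{aligned}
y_{s+M}^{(b_0)} ={}& l_{s+M}^{(b_0)} \sqrt{p_{s+M}^{(b_0)}}\, \mathbf{h}_{s+M}^{(b_0)\dagger} \mathbf{v}_{s+M} x_{s+M}^{(b_0)}
+ \sum_{s'=1,\,s'\neq s}^{S} \phi^{(b_{s'})} \sum_{u'=1}^{N_{s'}^{\mathrm{tx}}} l_{u'}^{(b_{s'})} \sqrt{p_{u'}^{(b_{s'})}}\, \mathbf{h}_{s}^{(b_{s'})\dagger} \mathbf{f}_{u'}^{(b_{s'})} x_{u'}^{(b_{s'})} \\
&+ \underbrace{\phi^{(b_s)} \sum_{u=1}^{N_s^{\mathrm{tx}}} l_u^{(b_s)} \sqrt{p_u^{(b_s)}}\, \mathbf{h}_{s}^{(b_s)\dagger} \mathbf{f}_u^{(b_s)} x_u^{(b_s)}}_{\text{self-interference}}
+ \sum_{k=1,\,k\neq s+M}^{K} l_k^{(b_0)} \sqrt{p_k^{(b_0)}}\, \mathbf{h}_{s+M}^{(b_0)\dagger} \mathbf{v}_k x_k^{(b_0)} + \eta_{s+M},
\end{aligned}
\tag{33}
\]

where $x_{s+M}^{(b_0)}$ is the signal symbol from the MBS to the SC $s$, $\mathbf{v}_{s+M}$ is the precoding vector of SC $s$, and $\eta_{s+M} \sim \mathcal{CN}(0,\sigma^2)$ is the thermal noise of the SC $s$.

The received signal from the SC $b_s$ at receiver $u$ is $y_u^{(b_s)} = 0$ if the SC $b_s$ operates in HD mode, $\phi^{(b_s)} = 0$. In FD mode, $\phi^{(b_s)} = 1$, the received signal $y_u^{(b_s)}$ is given by

\[
\begin{aligned}
y_u^{(b_s)} ={}& \phi^{(b_s)} l_u^{(b_s)} \sqrt{p_u^{(b_s)}}\, \mathbf{h}_u^{(b_s)\dagger} \mathbf{f}_u^{(b_s)} x_u^{(b_s)}
+ \sum_{s'=1,\,s'\neq s}^{S} \phi^{(b_{s'})} \sum_{u'=1}^{N_{s'}^{\mathrm{tx}}} l_{u'}^{(b_{s'})} \sqrt{p_{u'}^{(b_{s'})}}\, \mathbf{h}_{u}^{(b_{s'})\dagger} \mathbf{f}_{u'}^{(b_{s'})} x_{u'}^{(b_{s'})} \\
&+ \underbrace{\phi^{(b_s)} \sum_{j=1,\,j\neq u}^{N_s^{\mathrm{tx}}} l_j^{(b_s)} \sqrt{p_j^{(b_s)}}\, \mathbf{h}_{u}^{(b_s)\dagger} \mathbf{f}_j^{(b_s)} x_j^{(b_s)}}_{\text{co-tier self-interference}}
+ \sum_{k=1,\,k\neq u}^{K} l_k^{(b_0)} \sqrt{p_k^{(b_0)}}\, \mathbf{h}_{u}^{(b_0)\dagger} \mathbf{v}_k x_k^{(b_0)} + \eta_u,
\end{aligned}
\tag{34}
\]

where $x_u^{(b_s)}$ is the transmit data symbol from the SC $b_s$ to receiver $u$ and $\eta_u \sim \mathcal{CN}(0,\sigma^2)$ is the thermal noise at receiver $u$. The receiver $u$ can be either an SUE or an MUE.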

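The precoder $\mathbf{V}$ is designed at the MBS to null the co-tier interference and to completely remove the cross-tier interference to the SCs' users via (31), and the self-interference is well treated, while $\mathrm{Tr}\big(\mathbf{P}^{(b_s)}\mathbf{F}^{(b_s)\dagger}\mathbf{F}^{(b_s)}\big) = P^{(b_s)}$. Thus, according to (32)-(34), the SINRs of an MUE $m$ served by the MBS, an SC $s$ served by the MBS, and a receiver $u$ served by an SC are given in (35)-(37), respectively:

\[
\gamma_m^{(b_0)} = \frac{l_m^{(b_0)} p_m^{(b_0)} \big|\mathbf{h}_m^{(b_0)\dagger} \mathbf{v}_m\big|^2}{\sum_{k\neq m} l_k^{(b_0)} p_k^{(b_0)} \big|\mathbf{h}_m^{(b_0)\dagger} \mathbf{v}_k\big|^2 + \sum_{s} \phi^{(b_s)} P^{(b_s)} \big|\mathbf{h}_m^{(b_s)\dagger}\big|^2 + \sigma^2}, \tag{35}
\]

\[
\gamma_{s+M}^{(b_0)} = \frac{l_{s+M}^{(b_0)} p_{s+M}^{(b_0)} \big|\mathbf{h}_{s+M}^{(b_0)\dagger} \mathbf{v}_{s+M}\big|^2}{\sum_{k\neq s+M} l_k^{(b_0)} p_k^{(b_0)} \big|\mathbf{h}_{s+M}^{(b_0)\dagger} \mathbf{v}_k\big|^2 + \sum_{s'\neq s} \phi^{(b_{s'})} P^{(b_{s'})} \big|\mathbf{h}_s^{(b_{s'})\dagger}\big|^2 + \sigma^2}, \tag{36}
\]

\[
\gamma_u^{(b_s)} = \frac{\phi^{(b_s)} l_u^{(b_s)} p_u^{(b_s)} \big|\mathbf{h}_u^{(b_s)\dagger} \mathbf{f}_u^{(b_s)}\big|^2}{\phi^{(b_s)} \sum_{j=1,\,j\neq u} l_j^{(b_s)} p_j^{(b_s)} \big|\mathbf{h}_u^{(b_s)\dagger} \mathbf{f}_j^{(b_s)}\big|^2 + \sum_{s'\neq s} \phi^{(b_{s'})} P^{(b_{s'})} \big|\mathbf{h}_u^{(b_{s'})\dagger}\big|^2 + \sigma^2}. \tag{37}
\]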

3.3.2 Joint load balancing and interference mitigation

Let us consider a joint optimization of load balancing $\mathbf{l}$, operation mode $\boldsymbol{\phi}$, interference mitigation $\mathbf{U}$, and transmit power allocation $\mathbf{p} = (p_1^{(b_0)}, p_2^{(b_0)}, \dots, p_K^{(b_0)})$ that satisfies the transmit power budget of the MBS, i.e., $\mathrm{Tr}\big(\mathbf{P}\mathbf{T}^{\dagger}\mathbf{T}\big) \le P^{(b_0)}$. We define $\kappa_k^{(b_s)} = P^{(b_s)} \big|\mathbf{h}_k^{(b_s)\dagger}\big|^2 / \sigma^2$ and $\kappa_o$ as the FD interference-to-noise ratio (INR) from an FD-enabled SC $b_s$ to any scheduled receiver $k$ and the allowed FD INR threshold, respectively. The FD interference threshold is defined such that $\sum_{k=1}^{K}\sum_{s=1}^{S} \kappa_k^{(b_s)} \le \kappa_o$, so that the total FD interference can be considered as noise. Under the operation mode policy, we schedule the receiver $k$ and enable the transmission of SC $b_s$ as long as $\sum_{k=1}^{K}\sum_{s=1}^{S} l_k^{(b_0)} \phi^{(b_s)} \kappa_k^{(b_s)} \le \kappa_o$. Let $\boldsymbol{\Lambda}_o = \{\mathbf{l}, \boldsymbol{\phi}\}$ be a composite control variable of user association and operation mode. We define $\boldsymbol{\Lambda} = \{\boldsymbol{\Lambda}_o, \mathbf{U}, \mathbf{p}\}$ as a composite control variable, which adapts to the spatial channel correlation matrix $\boldsymbol{\Theta}$.

For a given $\boldsymbol{\Theta}$ that satisfies (31) and the operation mode policy, the respective ergodic data rates of SC $s$ and SUE $u$ are $r_{s+M}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) = \mathbb{E}\big[\log(1+\gamma_{s+M}^{(b_0)})\big]$ and $r_u^{(b_s)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) = \mathbb{E}\big[\log(1+\gamma_u^{(b_s)})\big]$. From constraint (30), the ergodic data rate of MUE $m$ depends on which BS the MUE is associated with, i.e.,

\[
r_m(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) = \mathbb{E}\big[\log(1+\gamma_m^{(b_0)})\big] + \sum_{s=1}^{S} \min\Big\{ \mathbb{E}\big[\log(1+\gamma_m^{(b_s)})\big],\ r_s(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) - \sum_{u\neq m} r_u^{(b_s)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) \Big\}.
\]

In other words, the first term is the data rate from the MBS to the MUE when the MUE is associated with the MBS, while the second term applies when an FD-enabled SC allows the MUE to connect. (If the MUE is connected to an FD-enabled SC, the rate of the MUE is the minimum of $r_m^{(b_s)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta})$ and the data stream delivered from the MBS via the FD-enabled SC to the MUE, excluding the part used by that SC's other users.)

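For a given composite control variable $\boldsymbol{\Lambda}$ that adapts to the spatial channel correlation matrix $\boldsymbol{\Theta}$, the average data rate region is defined as the convex hull of the average data rates of the users, which is expressed as:

\[
\begin{aligned}
\mathcal{R} \triangleq \Big\{\, \mathbf{r}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) \in \mathbb{R}_{+}^{K} \ \Big|\ & \mathbf{l} \in \{0,1\}^{K+MS+S},\ \boldsymbol{\phi} \in \{0,1\}^{S}, \\
& \textstyle\sum_{s=0}^{S} l_m^{(b_s)} \le 1,\ \forall m \in \mathcal{M}, \\
& \textstyle\sum_{m=1}^{M} l_m^{(b_s)} + l_{c_s}^{(b_s)} = N_s^{\mathrm{tx}},\ N_s^{\mathrm{tx}} \le N_s^{\mathrm{au}},\ \forall b_s \in \mathcal{S}, \\
& \textstyle\sum_{k=1}^{K} l_k^{(b_0)} + \sum_{s=1}^{S} N_s^{\mathrm{tx}} \le N, \\
& \textstyle\sum_{k=1}^{K}\sum_{s=1}^{S} l_k^{(b_0)} \phi^{(b_s)} \kappa_k^{(b_s)} \le \kappa_o, \\
& \mathrm{Tr}\big(\mathbf{P}\mathbf{T}^{\dagger}\mathbf{T}\big) \le P^{(b_0)},\ \mathbf{U}^{\dagger} \textstyle\sum_{s=1}^{S} \phi^{(b_s)} \boldsymbol{\Theta}_s^{(b_0)} = \mathbf{0} \,\Big\},
\end{aligned}
\]

where $\mathbf{r}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) = (r_1(\boldsymbol{\Lambda}|\boldsymbol{\Theta}), \dots, r_K(\boldsymbol{\Lambda}|\boldsymbol{\Theta}))^T$ and $\kappa_k^{(b_s)}$, $\kappa_o$ denote the FD INR and its allowed threshold. Following the results from [83], the boundary points of the rate region with a total power constraint and no self-interference are Pareto-optimal³. Moreover, according to [84, Proposition 1], if the INR covariance matrices approach the identity matrix, the Pareto rate region of the MIMO interference system is convex. Hence, our rate region is Pareto-optimal and thus convex under the above constraints.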

Let us assume that each FD-enabled SC acts as a relay to forward data to its users. If the MBS transmits data to an FD-enabled SC $b_s$ but the transmission of SC $b_s$ is disabled, the SC cannot serve its SUE. Hence, we define $\mathbf{D}(t) = (D_1(t), D_2(t), \dots, D_S(t))$ as the data queues at the SCs, where at each time slot $t$ the wireless backhaul queue at FD-enabled SC $b_s$ evolves as

\[
D_s(t+1) = \max\big[D_s(t) + r_{s+M}(t) - r_{c_s}^{(b_s)}(t),\ 0\big], \quad \forall s \in \mathcal{S}. \tag{38}
\]

The SC offloads some MUEs from the MBS only if the wireless backhaul capacity between the SCs and the MBS is guaranteed. Hence, for each SC we have the following wireless backhaul condition for all $t \ge 0$: "If the access link between the MUE $m$ and the MBS is better than the link between the MUE $m$ and the SCs, then the MUE connects with the MBS rather than with other SCs", i.e.,⁴

\[
\text{if } r_{s+M}(t) \le r_m^{(b_0)}(t), \text{ then } l_m^{(b_s)} = 0, \quad \forall s \in \mathcal{S},\ m \in \mathcal{K}. \tag{39}
\]

³ The Pareto-optimal set is the set of user rates at which it is impossible to improve any of the rates without simultaneously decreasing at least one of the others.

We define the network utility function $f_0(\cdot)$ to be non-decreasing and concave over the convex region $\mathcal{R}$ for a given $\boldsymbol{\Theta}$. The objective is to maximize the network utility under wireless backhaul constraints and imperfect CSI. Thus, the NUM problem is given by

\[
\begin{aligned}
\textbf{OP1:}\quad \max_{\mathbf{r}}\quad & f_0(\mathbf{r}) && \text{(40a)} \\
\text{subject to}\quad & \text{(39)},\ \mathbf{r} \in \mathcal{R},\ \mathbf{D} < \infty, && \text{(40b)}
\end{aligned}
\]

where $f_0(\mathbf{r}) = \sum_{k=1}^{K} \omega_k(t) f(r_k)$, $\omega_k(t) \ge 0$ is the weight of user $k$, and $f(\cdot)$ is assumed to be a twice differentiable, concave, and increasing $L$-Lipschitz function for all $r \ge 0$. Solving (40) is non-trivial since the average rate region $\mathcal{R}$ does not have a tractable form. To overcome this challenge, we need closed-form expressions of the data rate and the average transmit power. Inspired by [41], we invoke RMT to obtain closed-form expressions for the user data rate and transmit power as $N \gg K$.

3.3.3 Closed-form expression via a deterministic equivalent

We invoke recent results from RMT to get the deterministic equivalent of the user rate

and transmit power via Theorem 3.1.

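Theorem 3.1. Recall that $\zeta$ is the RZF parameter. As $N \gg K$; $N, K \to \infty$, by applying the technique in [41, Theorem 2], the deterministic equivalent of the asymptotic SINR of MUE $m$ is

\[
\gamma_m^{(b_0)} \xrightarrow{\text{a.s.}} \frac{l_m^{(b_0)} p_m^{(b_0)} (1-\tau_m^2)(\Omega_m)^2}{\Phi},
\]

where $\xrightarrow{\text{a.s.}}$ denotes almost sure convergence and $\Phi = \Upsilon_m\big[\zeta^2 - \tau_m^2\big(\zeta^2 - (\zeta+\Omega_m)^2\big)\big] + (\zeta+\Omega_m)^2\big(\sigma^2 + \sum_{s=1}^{S} \phi^{(b_s)} \kappa_m^{(b_s)}\big)$, with $\kappa_m^{(b_s)}$ the FD INR. Here, $\Omega_m = \frac{1}{N}\mathrm{Tr}(\boldsymbol{\Theta}_m \mathbf{G})$ is the unique positive solution, which is the Stieltjes transform of a nonnegative finite measure [41, Theorem 1], where

\[
\mathbf{G} = \Big( \frac{1}{N}\sum_{k=1}^{K} \frac{\boldsymbol{\Theta}_k}{\zeta + \Omega_k} + \mathbf{I}_{N_{\mathrm{itf}}} \Big)^{-1}.
\]

In addition,

\[
\Upsilon_m = \frac{1}{N} \sum_{k=1,\,k\neq m}^{K} \frac{\zeta^2\, l_k^{(b_0)} p_k^{(b_0)} e_{km}}{(\zeta+\Omega_k)^2},
\]

and $\boldsymbol{\Theta}_k = \mathbf{U}\mathbf{U}^{\dagger}\boldsymbol{\Theta}_k^{(b_0)}\mathbf{U}\mathbf{U}^{\dagger}$. The vectors $\mathbf{e} = [e_k],\ k \in \mathcal{K}$, and $\mathbf{e}_m = [e_{mk}],\ k \in \mathcal{K}$, are given by $\mathbf{e} = (\mathbf{I}-\mathbf{J})^{-1}\mathbf{u}$ and $\mathbf{e}_m = (\mathbf{I}-\mathbf{J})^{-1}\mathbf{u}_m$, where $\mathbf{J} = [J_{ij}],\ i,j \in \mathcal{K}$, $\mathbf{u} = [u_k],\ k \in \mathcal{K}$, and $\mathbf{u}_m = [u_{mk}],\ k \in \mathcal{K}$, are given by

\[
J_{ij} = \frac{\frac{1}{N}\mathrm{tr}\,\boldsymbol{\Theta}_i \mathbf{G} \boldsymbol{\Theta}_j \mathbf{G}}{N(\zeta+\Omega_j)^2}, \qquad
u_{mk} = \frac{1}{\zeta^2 N}\mathrm{tr}\,\boldsymbol{\Theta}_k \mathbf{G} \boldsymbol{\Theta}_m \mathbf{G}, \qquad
u_k = \frac{1}{\zeta^2 N}\mathrm{tr}\,\boldsymbol{\Theta}_k \mathbf{G}^2.
\]

Similarly, the SINR of SC $b_s$ is

\[
\gamma_s^{(b_0)} \xrightarrow{\text{a.s.}} \frac{l_s^{(b_0)} p_s^{(b_0)} (\Omega_s)^2}{\zeta^2 \Upsilon_s + (\zeta+\Omega_s)^2\big(\sigma^2 + \sum_{s'=1,\,s'\neq s}^{S} \phi^{(b_{s'})} \kappa_s^{(b_{s'})}\big)}.
\]

The power constraint at the MBS can be calculated as $\frac{1}{N}\sum_{k=1}^{K} \frac{p_k^{(b_0)} \zeta^2 e_k}{(\zeta+\Omega_k)^2} - P^{(b_0)} \le 0$.

Moreover, following the analysis in the proof of [41, Theorem 3] and [25, Lemma 6], for a small fixed $\zeta > 0$, $\Upsilon_k = O(1)$ and $\zeta^2 e_k = \Omega_k + O(\zeta)$ yield the deterministic equivalents of the asymptotic SINRs of the UEs in (35)-(37) as

\[
\gamma_m^{(b_0)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) \xrightarrow{\text{a.s.}} \frac{l_m^{(b_0)} p_m^{(b_0)} (1-\tau_m^2)}{\sigma^2 + \sum_{s=1}^{S} \phi^{(b_s)} \kappa_m^{(b_s)}}, \tag{41}
\]

\[
\gamma_s^{(b_0)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) \xrightarrow{\text{a.s.}} \frac{l_s^{(b_0)} p_s^{(b_0)}}{\sigma^2 + \sum_{s'=1,\,s'\neq s}^{S} \phi^{(b_{s'})} \kappa_s^{(b_{s'})}}, \tag{42}
\]

\[
\gamma_u^{(b_s)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) \xrightarrow{\text{a.s.}} \frac{\phi^{(b_s)} l_u^{(b_s)} p_u^{(b_s)}}{\sigma^2 + \sum_{s'=1,\,s'\neq s}^{S} \phi^{(b_{s'})} \kappa_u^{(b_{s'})}}. \tag{43}
\]

Moreover, we obtain the closed-form expression for the transmit power constraint, i.e., $\frac{1}{N}\sum_{k=1}^{K} \frac{p_k^{(b_0)}}{\Omega_k} - P^{(b_0)} \le 0$.

⁴ The queues of MUEs are handled at the MBS, and the SCs strictly handle data for SUEs. Hence, when the SCs open a connection for the MUEs, they should have immediate capacity in terms of data rate. We do not include the constraint (39) for the closed access case in [74].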
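As a small numerical illustration of the closed-form MUE SINR in (41), the sketch below evaluates it for all users at once. The function name `mue_sinr` and the array layout (one row per user, one column per SC for the FD INR values) are assumptions made for this example only.

```python
import numpy as np

def mue_sinr(l, p, tau, phi, inr, sigma2):
    """Closed-form SINR of (41) for each user served by the MBS.

    l, p, tau : arrays of shape (K,): association indicator, power, CSI error.
    phi       : array of shape (S,): operation mode of each SC (1 = FD, 0 = HD).
    inr       : array of shape (K, S): FD INR from SC s to user k.
    sigma2    : thermal noise power.
    """
    # Total FD interference seen by each user from the SCs that are in FD mode.
    fd_interference = inr @ phi
    return l * p * (1.0 - tau**2) / (sigma2 + fd_interference)
```

For instance, a user that is not associated with the MBS (indicator 0) gets an SINR of zero, while switching an SC to HD mode removes its column from the interference sum.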

Although closed-form expressions of the average data rate and transmit power are obtained, it is still challenging to solve problem (40), since (40) optimizes a function of time averages with a large number of control variables and a dynamic traffic load over the convex region for a given composite control variable $\boldsymbol{\Lambda}$ and $\boldsymbol{\Theta}$. Lyapunov stochastic optimization is a powerful tool to (i) transform a problem of a function of the time average into a problem of the time average of a function, and (ii) decouple complex problems into several simple sub-problems. Moreover, our aim is to maximize the aggregate network utility subject to queue stability, for which Lyapunov stochastic optimization yields utility optimality and queue stability [77]. Hence, we apply the drift-plus-penalty technique [77] to find the solutions for the load balancing, operation mode selection, and power allocation problems.


3.4 Proposed load balancing and interference mitigation

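We assume that the network system is modelled as a queueing network that operates in discrete time $t \in \{0,1,2,\dots\}$. Let $a_k(t)$ denote the bursty data arrival destined for user $k$, i.i.d. over time slots. Let $\mathbf{Q}(t)$ denote the vector of transmission queue backlogs at the MBS at slot $t$. The queue evolution is given by

\[
Q_k(t+1) = \max\big[Q_k(t) - r_k(t),\ 0\big] + a_k(t), \quad \forall k \in \mathcal{K}. \tag{44}
\]

Here, the traffic arrival of user $k$ is bounded so that $0 \le a_k(t) \le a_k^{\max}$ for some constant $a_k^{\max} < \infty$. Furthermore, let $r_k^{\max}(t)$ be the upper bound of the data rate for user $k$ at time slot $t$, such that $r_k^{\max}(t) \le a_k^{\max}$. The constraint set in (40b) is replaced by an equivalent set by introducing the auxiliary variables $\boldsymbol{\varphi}(t) \in \mathcal{R}$, $\boldsymbol{\varphi}(t) = (\varphi_1(t), \dots, \varphi_K(t))$, that satisfy $\bar{\varphi}_k \le \bar{r}_k$, where $\bar{\varphi}_k \triangleq \lim_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} \mathbb{E}[\varphi_k(\tau)]$. The evolution of the wireless backhaul queue is rewritten as

\[
D_s(t+1) = \max\big[D_s(t) + \varphi_{s+M}(t) - r_{c_s}^{(b_s)}(t),\ 0\big], \quad \forall s \in \mathcal{S}. \tag{45}
\]

For a given $\boldsymbol{\Lambda}$ and $\boldsymbol{\Theta}$, the optimization problem (40) subject to the network stability and dynamic backhaul can be posed as

\[
\begin{aligned}
\textbf{RP1:}\quad \min_{\boldsymbol{\varphi}}\quad & -f_0(\bar{\boldsymbol{\varphi}}) && \text{(46a)} \\
\text{subject to}\quad & \bar{\varphi}_k - \bar{r}_k \le 0,\ \forall k \in \mathcal{K}, && \text{(46b)} \\
& \text{(39)},\ \mathbf{D} < \infty,\ \mathbf{Q} < \infty. && \text{(46c)}
\end{aligned}
\]

In order to ensure the inequality constraint (46b), we introduce a virtual queue vector $\mathbf{Y}(t)$, which evolves as follows:

\[
Y_k(t+1) = \max\big[Y_k(t) + \varphi_k(t) - r_k(t),\ 0\big], \quad \forall k \in \mathcal{K}. \tag{47}
\]

We define the queue backlog vector as $\boldsymbol{\Xi}(t) = [\mathbf{Q}(t), \mathbf{Y}(t), \mathbf{D}(t)]$ (the stability of $\boldsymbol{\Xi}(t)$ ensures that all constraints of problem (46) hold). The Lyapunov function can be written as

\[
L(\boldsymbol{\Xi}(t)) \triangleq \frac{1}{2}\Big[\sum_{k=1}^{K} Q_k(t)^2 + \sum_{k=1}^{K} Y_k(t)^2 + \sum_{s=1}^{S} D_s(t)^2\Big].
\]

For each time slot $t$, $\Delta(\boldsymbol{\Xi}(t))$ denotes the Lyapunov drift, which is given by

\[
\Delta(\boldsymbol{\Xi}(t)) \triangleq \mathbb{E}\big[L(\boldsymbol{\Xi}(t+1)) - L(\boldsymbol{\Xi}(t)) \,\big|\, \boldsymbol{\Xi}(t)\big].
\]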
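The queue dynamics (44), (45), (47) and the quadratic Lyapunov function can be written down in a few lines. The following is a minimal sketch, assuming users are indexed so that the first M entries are MUEs and the remaining S entries are SCs; the function names `step_queues` and `lyapunov` are hypothetical.

```python
import numpy as np

def step_queues(Q, Y, D, r, r_sue, a, aux, M):
    """One-slot update of the network, virtual, and backhaul queues.

    Q, Y, r, a, aux : arrays of shape (K,), with K = M + S.
    D, r_sue        : arrays of shape (S,); r_sue is the SC-to-SUE rate.
    M               : number of MUEs (users M..K-1 are the SCs).
    """
    Q_next = np.maximum(Q - r, 0.0) + a            # network queue, as in (44)
    Y_next = np.maximum(Y + aux - r, 0.0)          # virtual queue, as in (47)
    D_next = np.maximum(D + aux[M:] - r_sue, 0.0)  # backhaul queue, as in (45)
    return Q_next, Y_next, D_next

def lyapunov(Q, Y, D):
    """Quadratic Lyapunov function of the stacked queue vector."""
    return 0.5 * (np.sum(Q**2) + np.sum(Y**2) + np.sum(D**2))
```

Evaluating `lyapunov` before and after `step_queues` gives a one-sample realization of the drift whose conditional expectation is bounded in the derivation that follows.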


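Noting that $\max[a,0]^2 \le a^2$ and $(a \pm b)^2 \le a^2 \pm 2ab + b^2$ for any real positive numbers $a, b$, and neglecting the index $t$, we have:

\[
\begin{aligned}
\big(\max[Q_k - r_k,\ 0] + a_k\big)^2 - Q_k^2 &\le 2Q_k(a_k - r_k) + (a_k - r_k)^2, \\
\max[Y_k + \varphi_k - r_k,\ 0]^2 - Y_k^2 &\le 2Y_k(\varphi_k - r_k) + (\varphi_k - r_k)^2, \\
\max\big[D_s + \varphi_{s+M} - r_{c_s}^{(b_s)},\ 0\big]^2 - D_s^2 &\le 2D_s\big(\varphi_{s+M} - r_{c_s}^{(b_s)}\big) + \big(\varphi_{s+M} - r_{c_s}^{(b_s)}\big)^2.
\end{aligned}
\]

We assume that $\varphi_k \in \mathcal{R}$ and a feasible $\mathbf{l}$ for all $t$ and all possible $\boldsymbol{\Xi}(t)$; thus we have

\[
\begin{aligned}
\Delta(\boldsymbol{\Xi}(t)) \le{}& \Psi + \sum_{k=1}^{K} Q_k(t)\, \mathbb{E}\big[a_k(t) - r_k(t) \,\big|\, \boldsymbol{\Xi}(t)\big]
+ \sum_{s=1}^{S} D_s(t)\, \mathbb{E}\big[\varphi_{s+M}(t) - r_{c_s}^{(b_s)}(t) \,\big|\, \boldsymbol{\Xi}(t)\big] \\
&+ \sum_{k=1}^{K} Y_k(t)\, \mathbb{E}\big[\varphi_k(t) - r_k(t) \,\big|\, \boldsymbol{\Xi}(t)\big].
\end{aligned}
\tag{48}
\]

Here $\Delta(\boldsymbol{\Xi}(t)) \le \Pi$, where $\Pi$ represents the R.H.S. of (48), and $\Psi$ is a finite constant that satisfies $\Psi \ge \frac{1}{2}\sum_{k=1}^{K} \mathbb{E}\big[(a_k(t) - r_k(t))^2 \,|\, \boldsymbol{\Xi}(t)\big] + \frac{1}{2}\sum_{k=1}^{K} \mathbb{E}\big[(\varphi_k(t) - r_k(t))^2 \,|\, \boldsymbol{\Xi}(t)\big] + \frac{1}{2}\sum_{s=1}^{S} \mathbb{E}\big[(\varphi_{s+M}(t) - r_{c_s}^{(b_s)}(t))^2 \,|\, \boldsymbol{\Xi}(t)\big]$ for all $t$ and all possible $\boldsymbol{\Xi}(t)$. We apply the Lyapunov drift-plus-penalty technique [77], where the solution of (46) is obtained by minimizing the Lyapunov drift and a penalty from the objective function, i.e.,

\[
\min\ \Pi - \nu\, \mathbb{E}\big[f_0(\boldsymbol{\varphi}(t))\big].
\]

Here, the parameter $\nu$ is chosen as a non-negative constant to control the optimal minimization solution [77]. Since $\Psi$ is finite, the problem is to minimize the expression below, subject to the convex hull set:

\[
\Big[\Big[\overbrace{-\sum_{k}\big(Q_k(t) + Y_k(t)\big)\, r_k(\boldsymbol{\Lambda}(t))}^{\text{impact of network queue, virtual queue, and } \boldsymbol{\Lambda}}\Big]_{1\star}
\overbrace{-\sum_{s} D_s(t)\, r_{c_s}^{(b_s)}\big(\phi^{(b_s)}(t)\big)}^{\text{impact of SC queue and } \boldsymbol{\phi}}\Big]_{2\star}
+ \Big[\overbrace{\sum_{k} Y_k(t)\varphi_k(t) + \sum_{s} D_s(t)\varphi_{s+M}(t)}^{\text{impact of virtual queue, SC queue, and auxiliaries}}\
\overbrace{-\nu f_0(\boldsymbol{\varphi}(t))}^{\text{penalty}}\Big]_{3\star}. \tag{49}
\]

Note that (49) is decoupled over the user association, user scheduling, and operation mode variables ($2\star$), the auxiliary variables ($3\star$), and the precoder and power allocation variables ($1\star$). Hence, the respective variables can be found independently by minimizing the individual terms at each time slot. Fig. 5 summarizes the relationship among the various subproblems.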


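[Figure: block diagram over time indices. Blocks: Load Balancing & Operation Mode (Algorithm 1, outputs $\mathbf{l}(t)^\star$, $\boldsymbol{\phi}(t)^\star$, with equal power $p_k(t)^{(b_0)} = P^{(b_0)}/K$), Beamforming Design ($\mathbf{V} = \mathbf{U}\mathbf{T}$), Power Allocation ($\mathbf{p}(t)^{(b_0)\star}$), Auxiliary Variable Selection ($\boldsymbol{\varphi}(t)^\star$), and Queue Update ($\mathbf{Q}(t), \mathbf{D}(t), \mathbf{Y}(t)$ to $\mathbf{Q}(t+1), \mathbf{D}(t+1), \mathbf{Y}(t+1)$). Entities: MBS, SCs, SUEs, MUEs, linked by CSI report and DL transmission.]

Fig. 5. Joint load balancing and interference mitigation algorithm ([23] ©2017 IEEE).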

3.4.1 Joint load balancing and operation mode selection

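First, the problem of joint load balancing and FD-enabled SC operation mode selection in ($2\star$) is cast as the minimization problem below:

\[
\begin{aligned}
\min_{\mathbf{l},\boldsymbol{\phi}}\quad & -\sum_{k=1}^{K} A_k(t) \log\Big(1 + l_k^{(b_0)}(t)\, \frac{p_k^{(b_0)}(1-\tau_k^2)}{\sigma^2 + \sum_{s=1}^{S} \phi^{(b_s)} \kappa_k^{(b_s)}}\Big)
- \sum_{s=1}^{S} D_s(t) \log\Big(1 + \phi^{(b_s)}(t)\, \frac{l_{c_s}^{(b_s)}(t)\, p_{c_s}^{(b_s)}}{\sigma^2 + \sum_{s'\neq s}^{S} \phi^{(b_{s'})} \kappa_{c_s}^{(b_{s'})}}\Big) && \text{(50a)} \\
\text{subject to}\quad & l_j^{(b_s)}(t) \in \{0,1\},\ \forall j \in \mathcal{K} \cup \mathcal{C},\ \forall b_s \in \mathcal{B}, && \text{(50b)} \\
& \phi^{(b_s)}(t) \in \{0,1\},\ N_s^{\mathrm{tx}}(t) \le N_s^{\mathrm{au}},\ \forall s \in \mathcal{S}, && \text{(50c)} \\
& \textstyle\sum_{s=0}^{S} l_m^{(b_s)}(t) \le 1,\ \forall m \in \mathcal{M}, \quad \sum_{m=1}^{M} l_m^{(b_s)}(t) + l_{c_s}^{(b_s)}(t) = N_s^{\mathrm{tx}}(t), && \text{(50d)} \\
& \text{(39)},\ r_k(t) \in \mathcal{R}, \quad \textstyle\sum_{k=1}^{K} l_k^{(b_0)}(t) + \sum_{s=1}^{S} N_s^{\mathrm{tx}}(t) \le N, && \text{(50e)} \\
& \textstyle\sum_{k=1}^{K}\sum_{s=1}^{S} l_k^{(b_0)}(t)\, \phi^{(b_s)}(t)\, \kappa_k^{(b_s)}(t) \le \kappa_o, && \text{(50f)}
\end{aligned}
\]

where $A_k(t) = Q_k(t) + Y_k(t)$, and $\kappa_k^{(b_s)}$ and $\kappa_o$ denote the FD INR and its threshold. This problem is a non-convex program with binary variables. It turns out that this problem has a hidden convexity structure, and the non-convex terms can be iteratively approximated by their convex upper bounds via an iterative SCA method. The motivations for utilizing the SCA method are (i) its low complexity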


and fast convergence [85, Lemma 3.5] and (ii) the fact that, in the obtained solution, many relaxed variables are close to zero or one [86]. In this regard, we convexify this problem to find a suboptimal solution. First, we relax the binary constraints (50b) and (50c) to linear constraints on continuous variables. Secondly, at each iteration $i$ the non-convex constraint (50f) is approximated by its convex upper approximation, i.e.,

\[
\sum_{k=1}^{K}\sum_{s=1}^{S} \Big( \frac{\delta_{ks}^{(i)} \big(l_k^{(b_0)}(t)\big)^2}{2} + \frac{\big(\phi^{(b_s)}(t)\big)^2}{2\delta_{ks}^{(i)}} \Big) \kappa_k^{(b_s)}(t) - \kappa_o \le 0,
\]

for every fixed positive value $\delta_{ks}^{(i)}$. Finally, instead of minimizing the non-convex objective function (50a), we convert it into a convex function as follows. We minimize its upper bound by replacing the denominators, i.e., $\sigma^2 + \sum_{s=1}^{S} \phi^{(b_s)} \kappa_m^{(b_s)}$, with the largest bound, i.e., $\sigma^2 + \kappa_o$. Due to the interference constraint (50f), we obtain the upper bound below:

\[
-\sum_{k=1}^{K} A_k(t) \log\Big(1 + \frac{l_k^{(b_0)}(t)\, p_k^{(b_0)} (1-\tau_k^2)}{\sigma^2 + \kappa_o}\Big)
- \sum_{s=1}^{S} D_s(t) \log\Big(1 + \phi^{(b_s)}(t)\, \frac{l_{c_s}^{(b_s)}(t)\, p_{c_s}^{(b_s)}}{\sigma^2 + \kappa_o}\Big).
\]

Using an approach similar to the convexification of the interference constraint (50f), we convexify the second part of the objective function, which still remains non-convex. We denote the lower bound of the SINR of the UE served by SC $b_s$ as $\gamma_{b_s}(t)$. Let us set $\tilde{l}_{c_s}^{(b_s)}(t) \triangleq \frac{l_{c_s}^{(b_s)}(t)\, p_{c_s}^{(b_s)}}{\sigma^2 + \kappa_o}$. Then we have:

\[
\gamma_{b_s}(t) \le \phi^{(b_s)}(t)\, \tilde{l}_{c_s}^{(b_s)}(t), \quad \forall s \in \mathcal{S}. \tag{51}
\]

By introducing the new slack variable $\rho_s^2(t)$, (51) is equivalent to:

\[
\frac{1}{4}\big(\phi^{(b_s)}(t) - \tilde{l}_{c_s}^{(b_s)}(t)\big)^2 + \rho_s^2(t) \le \frac{1}{4}\big(\phi^{(b_s)}(t) + \tilde{l}_{c_s}^{(b_s)}(t)\big)^2, \tag{52}
\]

\[
\text{and}\quad \gamma_{b_s}(t) \le \rho_s^2(t), \quad \forall s \in \mathcal{S}, \tag{53}
\]

where the constraint (52) has the form of a second-order cone (SOC) inequality, while the RHS of the constraints in (53) is still non-convex and can be approximated by using the iterative SCA method [85]. We rewrite the constraint (53) as

\[
\gamma_{b_s}(t) \le \rho_s^{(i)2}(t) + 2\rho_s^{(i)}(t)\big(\rho_s(t) - \rho_s^{(i)}(t)\big), \quad \forall s \in \mathcal{S}, \tag{54}
\]

where at iteration $i+1$ we update $\rho_s^{(i+1)}(t)$ such that $\rho_s^{(i+1)}(t) = \rho_s^{(i)}(t)$.

Algorithm 3.1 Joint load balancing and operation mode algorithm ([23] ©2017 IEEE)
1: Initialization: $i := 0$; $\delta_{ks}^{(i)}$, $\rho_s^{(i)}$ := random positive values that satisfy all constraints.
2: repeat
3:   Solve (55) with $\delta_{ks}^{(i)}$, $\rho_s^{(i)}$ to get the optimal value $\boldsymbol{\Lambda}_o^\star = \{\mathbf{l}^\star, \boldsymbol{\phi}^\star\}$.
4:   Update $\boldsymbol{\Lambda}_o^{(i)} := \boldsymbol{\Lambda}_o^\star$ and $\delta_{ks}^{(i+1)} := \phi^{(b_s)(i)} / l_k^{(b_0)(i)}$; $\rho_s^{(i+1)} := \rho_s^{(i)}$; $i := i+1$.
5: until Convergence

Hence, the optimal value of $\boldsymbol{\Lambda}_o$ is given by

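\[
\begin{aligned}
\min_{\mathbf{l},\boldsymbol{\phi}}\quad & -\sum_{k=1}^{K} A_k(t) \log\Big(1 + l_k^{(b_0)}(t)\, \frac{p_k^{(b_0)}(1-\tau_k^2)}{\sigma^2 + \kappa_o}\Big) - \sum_{s=1}^{S} D_s(t) \log\big(1 + \gamma_{b_s}(t)\big) && \text{(55a)} \\
\text{subject to}\quad & l_j^{(b_s)}(t) \in [0,1],\ \forall j \in \mathcal{K} \cup \mathcal{C},\ \forall b_s \in \mathcal{B}, && \text{(55b)} \\
& \phi^{(b_s)}(t) \in [0,1],\ N_s^{\mathrm{tx}}(t) \le N_s^{\mathrm{au}},\ \forall s \in \mathcal{S}, && \text{(55c)} \\
& \text{(50d)},\ \text{(50e)},\ \text{(52)},\ \text{(54)}, && \text{(55d)} \\
& \sum_{k=1}^{K}\sum_{s=1}^{S} \Big( \frac{\delta_{ks}^{(i)} \big(l_k^{(b_0)}(t)\big)^2}{2} + \frac{\big(\phi^{(b_s)}(t)\big)^2}{2\delta_{ks}^{(i)}} \Big) \kappa_k^{(b_s)}(t) - \kappa_o \le 0. && \text{(55e)}
\end{aligned}
\]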
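The convex surrogate used in (55e) replaces each bilinear product of an association variable and an operation mode variable by a quadratic upper bound that is tight at the current iterate. The short sketch below checks this numerically; the function name `bilinear_upper_bound` is hypothetical, and the scalar arguments stand for a single pair of relaxed variables.

```python
def bilinear_upper_bound(l, phi, delta):
    """Convex upper bound on the product l * phi for any delta > 0.

    It follows from (sqrt(delta) * l - phi / sqrt(delta))**2 >= 0 and is
    tight when delta = phi / l, which is the update rule in Algorithm 3.1.
    """
    return 0.5 * delta * l**2 + phi**2 / (2.0 * delta)
```

For example, with `l = 0.5` and `phi = 0.8` the product is 0.4; the bound evaluates to 0.445 for `delta = 1` and to 0.4 for `delta = phi / l = 1.6`, so re-centering `delta` at the previous solution never loosens the approximation at that point.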

At each time slot $t$, the approximated problem (55) is solved iteratively as in Algorithm 3.1. We numerically observe that the SCA-based Algorithm 3.1 converges within a few iterations and yields a continuous relaxed solution in which many user association and operation mode variables are close or equal to binary values. To ensure that all users will be served, each user is assumed to receive the same transmit power when Algorithm 3.1 is run to find the best scheduled users. Moreover, the scheduling is performed over a long-term period, while the power allocation problem is executed over a short-term period. Since the objective function of problem (55) is a maximum weighted matching problem with respect to a linear or square function, we use a low-complexity binary search algorithm [87] to obtain the final solutions with lower dimensions. Let $\mathcal{K}_1 = \{j,s \mid l_j^{(b_s)\star}, \phi^{(b_s)\star} = 1\}$, $\mathcal{K}_{\mathrm{uct}} = \{j,s \mid \xi \le l_j^{(b_s)\star}, \phi^{(b_s)\star} \le 1\}$, and $\mathcal{K}_0 = \{j,s \mid l_j^{(b_s)\star}, \phi^{(b_s)\star} \le \xi\}$ denote the set of selected variables, the set of uncertain variables, and the set of removed variables, respectively, where $\xi$ is some small threshold. First, we determine the sets $\mathcal{K}_1$, $\mathcal{K}_{\mathrm{uct}}$, and $\mathcal{K}_0$ based on $\xi$. Then, we select among the uncertain variables in $\mathcal{K}_{\mathrm{uct}}$. After sorting $\mathcal{K}_{\mathrm{uct}}$ in descending order, a loop selects the variables one by one, starting with the largest weight according to the objective function. We set the uncertain variable to 1 and add it to $\mathcal{K}_1$ if it satisfies the antenna (55d) and interference (55e) constraints; otherwise, we add it to $\mathcal{K}_0$. The loop stops when it reaches the last uncertain variable or the antenna budget is exhausted. Finally, $\mathcal{K}_1$ is kept, while $\mathcal{K}_0$ and $\mathcal{K}_{\mathrm{uct}}$ are removed.
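The rounding step described above can be sketched as a greedy pass over the uncertain variables. This is an illustrative simplification, not the binary search algorithm of [87]: the function name `round_relaxed`, the dictionary-based interface, and the `feasible` callback (which stands in for the antenna and interference checks) are all assumptions, and the loop simply visits every uncertain variable instead of stopping early.

```python
def round_relaxed(values, weights, xi, feasible):
    """Greedy rounding of relaxed variables in [0, 1] to a binary selection.

    values   : dict mapping a variable id to its relaxed value.
    weights  : dict mapping a variable id to its weight in the objective.
    xi       : small threshold below which a variable is dropped.
    feasible : callable taking a set of selected ids and returning True if
               the antenna and interference constraints still hold (assumed).
    """
    # Variables already (numerically) equal to one are selected outright.
    selected = {v for v, x in values.items() if x >= 1.0 - 1e-9}
    # Uncertain variables: strictly between the threshold and one.
    uncertain = [v for v, x in values.items() if xi < x < 1.0 - 1e-9]
    uncertain.sort(key=lambda v: weights[v], reverse=True)
    for v in uncertain:
        # Keep the variable only if the constraints remain satisfied.
        if feasible(selected | {v}):
            selected.add(v)
    return selected
```

Variables at or below the threshold never enter the loop, which mirrors the role of the removed set in the text.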

3.4.2 Auxiliary variable optimization

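The optimal auxiliary variable from ($3\star$) is computed by

\[
\begin{aligned}
\min_{\boldsymbol{\varphi}(t)}\quad & \sum_{k=1}^{K} Y_k(t)\varphi_k(t) + \sum_{s=1}^{S} D_s(t)\varphi_{s+M}(t) && \text{(56a)} \\
& - \nu \sum_{k=1}^{K} \omega_k(t) f(\varphi_k(t)) && \text{(56b)} \\
\text{subject to}\quad & \varphi_k(t) \le a_k^{\max}(t). && \text{(56c)}
\end{aligned}
\]

Since the above optimization problem is convex, let $\varphi_k^{*}(t)$ be the optimal solution obtained from the first-order derivative of the objective function of (56). With a logarithmic utility function, we have:

\[
\varphi_k^{*}(t) =
\begin{cases}
\dfrac{\nu\,\omega_k(t)}{Y_k(t)} & \text{if } k \le M, \\[2ex]
\dfrac{\nu\,\omega_k(t)}{Y_k(t) + D_{k-M}(t)} & \text{otherwise.}
\end{cases}
\]

The optimal auxiliary variable is $\min\{\varphi_k^{*}(t),\ a_k^{\max}(t)\}$.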
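The closed-form auxiliary variable for the logarithmic utility can be computed directly from the queue states. The sketch below is a minimal illustration; the function name `auxiliary_variables` is hypothetical, and the handling of empty queues (where the unconstrained optimum is unbounded and the cap applies) is an assumption of this example.

```python
import numpy as np

def auxiliary_variables(Y, D, omega, nu, a_max, M):
    """Closed-form auxiliary variables for f = log, capped at a_max.

    Y, omega, a_max : arrays of shape (K,), with K = M + S.
    D               : array of shape (S,), backhaul queues of the SCs.
    nu              : drift-plus-penalty trade-off parameter.
    M               : number of MUEs (users M..K-1 are the SCs).
    """
    denom = np.asarray(Y, dtype=float).copy()
    denom[M:] += D                       # SC users also see their backhaul queue
    safe = np.where(denom > 0, denom, 1.0)
    # With an empty queue the unconstrained optimum is unbounded (assumption:
    # treat it as +inf so that the cap a_max is returned).
    unconstrained = np.where(denom > 0, nu * omega / safe, np.inf)
    return np.minimum(unconstrained, a_max)
```

A larger `nu` pushes the auxiliary variables, and hence the admitted traffic, upward, while growing virtual or backhaul queues pull them down.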

3.4.3 Interference mitigation and power allocation

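For given scheduled users, the precoder $\mathbf{U}$ is found by solving (31). Finally, problem (46) is decomposed to find the transmit power $p_k^{(b_0)}(t)$ from ($1\star$) by solving:

\[
\begin{aligned}
\min_{\mathbf{p}(t)}\quad & -\sum_{k=1}^{K} A_k(t)\, r_k(\mathbf{p}(t)) && \text{(57a)} \\
\text{subject to}\quad & \frac{1}{N}\sum_{k=1}^{K} \frac{p_k^{(b_0)}(t)}{\Omega_k(t)} - P^{(b_0)} \le 0, \\
& p_k^{(b_0)}(t) \ge 0,\ \forall k \in \mathcal{K}.
\end{aligned}
\]

The objective function (57) is rewritten as $n(\mathbf{p}(t)) = -\sum_{k=1}^{K} A_k(t) \log\big(1 + p_k^{(b_0)}(t)\, n_k(t)\big)$, where $n_k(t) = \frac{l_k^{(b_0)}(t)(1-\tau_k^2)}{\sigma^2 + \sum_{s=1}^{S} \phi^{(b_s)}(t)\, \kappa_k^{(b_s)}(t)}$. The objective function is strictly convex for $p_k^{(b_0)}(t) \ge 0,\ \forall k \in \mathcal{K}$, and the constraint set is compact. Hence, the optimal solution $\mathbf{p}^\star(t)$ exists. The Lagrangian function is written as $\mathcal{L}(\mathbf{p}(t), \mu_0) = n(\mathbf{p}(t)) + \mu_0\, g(\mathbf{p}(t))$, where $g(\mathbf{p}(t))$ denotes the left-hand side of the power constraint in (57) and $\mu_0 \ge 0$ is the KKT multiplier. The KKT conditions are

\[
\nabla n(\mathbf{p}(t))^T + \mu_0 \frac{1}{N}\sum_{k=1}^{K} \frac{1}{\Omega_k(t)} = 0, \tag{58}
\]

\[
\mu_0 \Big( \frac{1}{N}\sum_{k=1}^{K} \frac{p_k^{(b_0)}(t)}{\Omega_k(t)} - P^{(b_0)} \Big) = 0, \tag{59}
\]

\[
\frac{1}{N}\sum_{k=1}^{K} \frac{p_k^{(b_0)}(t)}{\Omega_k(t)} - P^{(b_0)} \le 0, \quad -\mathbf{p}(t) \le 0, \quad \mu_0 \ge 0. \tag{60}
\]

Here, $\nabla n(\mathbf{p}(t))^T = \big(n'(p_1^{(b_0)}(t)), \dots, n'(p_K^{(b_0)}(t))\big)$, where $n'(p_k^{(b_0)}(t)) = \frac{-A_k(t)\, n_k(t)}{1 + p_k^{(b_0)}(t)\, n_k(t)}$. For $\mu_0 \neq 0$, from (58) we obtain

\[
p_k^{(b_0)}(t) = \max\Big[ \frac{A_k N \Omega_k(t)}{\mu_0} - \frac{1}{n_k(t)},\ 0 \Big], \tag{61}
\]

and from (59) and (61) we derive $\mu_0$. Finally, the optimal value of $p_k(t)^{(b_0)\star}$ is obtained with (61).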
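The power allocation in (61) has a water-filling structure, and the multiplier can be found numerically by bisection on the complementary slackness condition (59). The sketch below is one possible way to do this, not necessarily the procedure used in the thesis: the function name `power_allocation`, the bisection bracket, and the iteration count are assumptions, and all entries of `omega` are assumed positive.

```python
import numpy as np

def power_allocation(A, n, omega, N, P_max, iters=100):
    """Water-filling power allocation following the form of (61).

    A     : (K,) queue weights A_k = Q_k + Y_k.
    n     : (K,) effective channel gains n_k (0 for unscheduled users).
    omega : (K,) deterministic-equivalent terms Omega_k (assumed > 0).
    N     : number of MBS antennas; P_max : MBS power budget.
    """
    A, n, omega = (np.asarray(x, dtype=float) for x in (A, n, omega))
    active = (A > 0) & (n > 0)            # only these users can receive power
    if not active.any():
        return np.zeros_like(A)

    def alloc(mu):
        p = np.zeros_like(A)
        p[active] = np.maximum(
            A[active] * N * omega[active] / mu - 1.0 / n[active], 0.0)
        return p

    def used(mu):
        # Left-hand side of the power constraint, without the budget term.
        return np.sum(alloc(mu) / omega) / N

    # At mu = hi every user gets zero power; as mu -> 0 the usage grows unbounded.
    lo, hi = 1e-12, np.max(A[active] * N * omega[active] * n[active])
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if used(mid) > P_max:
            lo = mid
        else:
            hi = mid
    return alloc(hi)                      # feasible by construction
```

Users with larger queue weights or better effective channels receive more power, and users whose water level falls below `1 / n_k` are switched off, as the `max[., 0]` in (61) prescribes.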

3.4.4 Queue update

Update the virtual queues $Y_k(t)$ and $D_s(t)$ according to (47) and (45), and the actual queue $Q_k(t)$ according to (44).

Theorem 3.2 characterizes the performance of the network utility maximization based on the Lyapunov framework and shows that the queues are stable.

Theorem 3.2. [Optimality] Assume that all queues are initially empty. For arbitrary arrival rates, the operation mode and load balancing are chosen to satisfy (49) and the rate region. For a given constant $\chi \ge 0$, the network utility maximization with any $\nu > 0$ provides the following utility performance with a $\chi$-approximation:

\[
f_0 \ge f_0^{\star} - \frac{\Psi + \chi}{\nu},
\]

where $f_0^{\star}$ is the optimal network utility over the rate region. The strong stability of the virtual queues and the network queues is given by

\[
\begin{aligned}
Q_k(t) &\le \nu\,\omega_k(t)\,\pi_k + 2a_k^{\max}, && \forall t \ge 0,\ \forall k \in \mathcal{K}, \\
Y_k(t) &\le \nu\,\omega_k(t)\,\pi_k + a_k^{\max}, && \forall t \ge 0,\ \forall k \in \mathcal{K}, \\
D_s(t) &\le \nu\,\omega_{s+M}(t)\,\pi_{s+M} + a_{s+M}^{\max}, && \forall t \ge 0,\ \forall s \in \mathcal{K}.
\end{aligned}
\]

Proof: The proof can be found in [77] and is omitted for the sake of brevity.


3.5 Numerical results

In this section, Monte Carlo simulations are carried out to evaluate the system performance of our proposed algorithm. To solve Algorithm 3.1, we use the YALMIP toolbox [88] to model the optimization problem, with SDPT3 [89] or MOSEK [90] as the internal solver. For the simulation, we consider the proportional fairness utility function, i.e., $f(r_k) = \log(10^{-4} + r_k)$ [91]. We denote our proposed user association algorithm for the HetNet (resp. homogeneous network) as HetNet-Hybrid (resp. HomNet [41]). Here, HomNet [41] refers to the case in which the MBS serves both MUEs and SUEs without SCs. We compare our proposed algorithm with HomNet [41] and with the previous work [74] (HetNet-Closed Access [74]). The HetNet-Closed Access [74] case considers only a joint in-band scheduling and interference mitigation algorithm with a fixed user association scheme (SCs are configured in closed subscriber group mode). The network performance is evaluated under the impact of the number of SCs per km², the number of MBS antennas $N$, and the MBS transmit power level $P^{(b_0)}$ at low and high frequency bands. We also provide the convergence behaviour of the proposed method and a validation of the approximation method.

3.5.1 Simulation environments

Consider a HetNet scenario in which an MBS is located at the center of a square area and MUEs are randomly deployed within the coverage of the MBS (the minimum MBS-MUE distance is 35 m). The SCs are uniformly distributed, and one SUE per SC is considered. The number of antennas at each SC, $N_s$, is greater than two, while we assume each SC can serve up to Naus = 2 UEs (including its own SUE). The path loss is modeled as a distance-based path loss with a line-of-sight (LOS) model for urban environments [49, 92, 93]. For the performance evaluation, we first assume that the probability of obtaining LOS is very high, while the effect of other channel models is studied later. The FD interference threshold ko is set to $5 \times 10^{-3}$ and the RZF parameter is $\zeta = 10^{-2}$. The data arrivals follow a Poisson distribution with a mean rate of 1 Gbps, 100 Mbps, and 20 Mbps at 28 GHz, 10 GHz, and 2.4 GHz, respectively. The parameter settings are summarized in Table 1.


Table 1. Parameter settings ([23] ©2017 IEEE).

Parameter                                     Value
Maximum transmit power of MBS P^(b0)          41 dBm
Maximum transmit power of SC                  30 dBm
Channel quality τ                             0.1
RZF parameter ζ                               10^-2
FD interference threshold ko                  5 × 10^-3
SC antenna gain                               5 dBi
Number of antennas at SC                      Ns + 1
Lyapunov parameter ν                          2 × 10^6

Path loss model                               Value
LOS @ 28 GHz                                  61.4 + 20 log(d), bandwidth: 1 GHz
LOS @ 10 GHz                                  55.25 + 18.5 log(d), bandwidth: 100 MHz
LOS @ 2.4 GHz                                 17 + 37.6 log(d), bandwidth: 20 MHz
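As a quick illustration of how the LOS path loss entries in Table 1 can be evaluated, the following minimal sketch encodes the three models. It assumes that log denotes the base-10 logarithm, that the distance d is in metres, and that the result is in dB; none of these is stated explicitly in the table. The names `LOS_MODELS` and `los_path_loss_db` are hypothetical.

```python
import math

# (intercept [dB], slope, bandwidth [Hz]) per carrier frequency in GHz, from Table 1.
LOS_MODELS = {
    28.0: (61.4, 20.0, 1e9),
    10.0: (55.25, 18.5, 100e6),
    2.4: (17.0, 37.6, 20e6),
}

def los_path_loss_db(freq_ghz, distance_m):
    """Return intercept + slope * log10(d); log base 10 and metres are assumptions."""
    intercept, slope, _bandwidth = LOS_MODELS[freq_ghz]
    return intercept + slope * math.log10(distance_m)
```

Under these assumptions, a 100 m link at 28 GHz gives 61.4 + 20 × 2 = 101.4 dB.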

3.5.2 Ultra-dense small cells environment

To show the impact of the network density, the average UE throughput (avgUT) and the cell-edge UE throughput (cell-edge UT) as a function of the number of SCs are shown in Fig. 6 and Fig. 7, respectively. The maximum transmit power of the MBS and SCs is set to 41 dBm and 32 dBm, respectively. In Fig. 6 and Fig. 7, the simulation is carried out in an asymptotic regime where the number of BS antennas and the network size (MUEs and SCs) grow large with a fixed ratio [43]. In particular, the number of SCs and the number of SUEs are both increased from 36 to 1000 per km², while the number of MUEs is scaled up with the number of SCs such that $M = 1.5 \times S$. Moreover, the number of transmit antennas at the MBS and SCs is set to $N = 2 \times K$ and $N_s = 6$, respectively. Recall that adding an SC also adds one SUE, which increases the network load. Here, the total number of users increases while the maximum transmit power is fixed; thus, the per-user transmit power scales as $1/K$, which reduces the per-UE throughput. Even though the number of MBS antennas grows with $K$, the performance of the massive MIMO system reaches its limit as the number of antennas goes to infinity. It can be seen that, with an increasing network load, our proposed HetNet-Hybrid algorithm outperforms the baselines with respect to both the avgUT and the cell-edge UT. The performance gap in the cell-edge UT is largest (5.6×) when the number of SCs per km² is 350, and it is small when the number of SCs per km² is too small or too large. The reason is that, when the number of SCs per km² is too small, the probability that an MUE finds a nearby open access SC to connect to is low. As the number of SCs per km² increases, MUEs are more likely to connect to nearby open access SCs, which increases the cell-edge UT. However, when the number of SCs per km² is too large, the cell-edge UT performance of HetNet-Hybrid is close to that of HetNet-Closed Access [74] due to the increased FD interference. Moreover, Fig. 6 and Fig. 7 show that the combination of massive MIMO and FD-enabled SCs improves the network performance; for instance, HetNet-Hybrid and HetNet-Closed Access [74] outperform HomNet [41] in terms of both the avgUT and the cell-edge UT. Our results provide insights for network deployment: for a given target UE throughput, what is the optimal number of UEs to schedule, and what is the optimal/maximum number of SCs to be deployed?

[Figure 6 plot: achievable avgUT [Gbps] versus number of small cells per km² (36 to 1000) at 28 GHz; curves: HetNet-Hybrid, HetNet-Closed Access, HomNet.]

Fig. 6. Achievable avgUT versus number of small cells per km², S, when scaling according to K = 2.5×S, N = 2×K ([23] ©2017 IEEE).

[Figure 7 plot: achievable cell-edge UT [Gbps] versus number of small cells per km² (36 to 1000) at 28 GHz; curves include HetNet-Hybrid and HomNet.]

Fig. 7. Achievable cell-edge UT versus number of small cells per km², S, when scaling K = 2.5×S, N = 2×K ([23] ©2017 IEEE).

3.5.3 Wireless backhaul impact for different transmit power levels

We also report the avgUT and the total network utility (TNU), along with the average queue length ("dashed line"), as a function of the MBS maximum transmit power at different frequency bands (28 GHz, 10 GHz, and 2.4 GHz) in Fig. 8 and Fig. 9, respectively. In particular, we consider the number of SCs to be $S = 45$ per km², and the number of MUEs $M$ to be twice the number of SCs $S$. The number of MBS antennas is set to $N = K$, while the number of antennas at each SC, $N_s + 1$, is set to 5. Because the MBS has too few antennas to simultaneously serve all the MUEs and SCs and to alleviate the interference, offloading from the MBS to the SCs helps to associate more UEs with the BSs. In this case the TNU is low, since the number of MBS antennas is reduced by half compared with the cases used to study the impact of the number of MBS antennas. As the maximum transmit power at the MBS decreases, HetNet-Hybrid outperforms HetNet-Closed Access [74], and there is an inflexion point at which the performance of HetNet-Hybrid is close to that of HetNet-Closed Access [74], namely when the transmit power level is 25 dBm, 31 dBm, and 37 dBm at 28 GHz, 10 GHz, and 2.4 GHz, respectively. It can be observed that, at higher frequency bands, FD-enabled SCs work better in the open access mode than in the closed access mode under the same transmit power budget. When the maximum MBS transmit power is too small, the performance of HetNet-Hybrid and HetNet-Closed Access [74] is very close to that of HomNet [41].

[Figure 8 plot: achievable avgUT versus $P^{(b_0)}$ [dBm]; panels: 28 GHz [Gbps], 10 GHz [Mbps], 2.4 GHz [Mbps]; curves include HetNet-Hybrid and HomNet.]

Fig. 8. Achievable avgUT versus $P^{(b_0)}$ at 28, 10, and 2.4 GHz, when S = 45 per km², K = 3×S, N = K ([23] ©2017 IEEE).

[Figure 9 plot: total network utility and average queue length [Gb] versus $P^{(b_0)}$ [dBm]; panels: 28 GHz [Gbps], 10 GHz [Mbps], 2.4 GHz [Mbps]; curves include HetNet-Hybrid and HomNet.]

Fig. 9. The TNU ("solid line") and network queue length ("dashed line") versus $P^{(b_0)}$ at 28, 10, and 2.4 GHz, when S = 45 per km², K = 3×S, N = K ([23] ©2017 IEEE).

[Figure 10 plot: cumulative distribution of the number of iterations (1 to 10) for Algorithm 3.1 with HetNet-Hybrid; vertical axis: cumulative percent (0 to 1).]

Fig. 10. The CDF of the convergence of Algorithm 3.1 ([23] ©2017 IEEE).

3.5.4 Convergence

In Fig. 10 we show the convergence behaviour of our approximated algorithm based on the SCA method when deploying our HetNet-Hybrid algorithm. The convergence analysis is provided in Appendix 1.1. Unlike other works, we plot the cumulative distribution of the number of iterations at which Algorithm 3.1 converges for all $t$. We observe that the probability that the number of iterations is less than or equal to 4 is 90%, which implies that our proposed algorithm needs only a few iterations to converge.
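A convergence CDF of this kind can be computed from logged per-slot iteration counts. The following is a minimal sketch; the function name `iteration_cdf` is hypothetical, and any input data used with it (such as the sample in the usage note) is illustrative rather than the thesis' simulation output.

```python
import numpy as np

def iteration_cdf(iterations, max_iter):
    """Empirical CDF of the number of iterations to convergence.

    iterations: one positive integer per time slot t (iterations until convergence).
    Returns an array whose entry j-1 is the fraction of slots that converged
    within j iterations, for j = 1, ..., max_iter.
    """
    counts = np.bincount(np.asarray(iterations), minlength=max_iter + 1)
    counts = counts[1:max_iter + 1]          # drop the unused bin for 0 iterations
    return np.cumsum(counts) / len(iterations)
```

For instance, for the hypothetical log `[2, 3, 3, 4, 4, 4, 5, 2, 3, 7]`, the entry for "at most 4 iterations" is 0.8, since eight of the ten slots converged within four iterations.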

We then validate the accuracy of the closed-form expression for the data rate by comparing the Ergodic sum rate $R$, which is obtained by using the SINR from (35) and (36) in simulations of i.i.d. Rayleigh block-fading channels, to the approximated sum rate $\bar{R}$, which is obtained by using the SINR from (41) and (42). The sum rate is defined as the total sum of all user data rates. We define the absolute error as $\frac{R - \bar{R}}{R}$ and plot it versus the number of MBS antennas, while the number of users is fixed to $K = 12$. As can be seen in Fig. 11, the absolute error decreases as the number of MBS antennas increases. This means that the closed-form expressions are more accurate when the number of MBS antennas is much higher than the number of users, i.e., $N \gg K$. The impact of the Lyapunov parameter $\nu$ on the achievable average network utility and queue backlog has been shown in our previous work [74]. It has been observed that the gap to the optimal network utility shrinks as $O(1/\nu)$, while the network backlog grows linearly as $O(\nu)$. Hence, the choice of $\nu$ results in an $[O(1/\nu), O(\nu)]$ utility-queue backlog tradeoff, which leads to a utility-latency tradeoff [72].

[Figure 11 plot: absolute error of the sum rate approximation compared to the Ergodic sum rate (0 to 0.015) versus N (15 to 105).]

Fig. 11. Validation of the approximation of the closed-form expression, when K = 12 and K = 3×S ([23] ©2017 IEEE).

3.6 Summary and discussion

This chapter proposed an integrated access-backhaul (IAB) scheme in which a network utility maximization problem was studied to address joint load balancing and interference mitigation subject to backhaul dynamics and network stability in the presence of imperfect CSI. By using stochastic optimization, the studied problem was decoupled into dynamic scheduling of MUEs, backhaul provisioning of in-band FD-enabled SCs, and offloading of UEs to in-band FD-enabled SCs as a function of interference, number of antennas, and backhaul loads. The numerical results demonstrate that, although at lower frequency bands the performance of open access small cells is close to that of closed access at some operating points, open access full-duplex small cells still yield a higher gain than closed access at higher frequency bands. Moreover, open access full-duplex small cells achieve a 5.6× gain in cell-edge performance over closed access ones in ultra-dense networks with 350 small cell base stations per km².


This chapter made some idealized assumptions, namely perfect SIC, perfect channel reciprocity, and static UEs during each coherence time. In addition, when deploying massive antenna arrays, a general analytical mmWave channel model was adopted, so a more specific mmWave channel model should be considered in future work. Moreover, hybrid beamforming should be taken into account to reduce the number of RF chains and the hardware complexity.

Furthermore, this chapter considered a two-hop transmission scheme, and the problem of multi-hop multi-path transmission should be investigated as the network becomes denser and the communication range becomes shorter in mmWave environments. In this regard, the next chapter focuses on path selection and rate allocation over multi-hop mmWave transmissions.


4 Self-backhauled multi-hop architecture

In this chapter, the author extends the scenario studied in Chapter 3 to self-backhauled multi-hop transmissions and answers research question Q2. In particular, the author proposes a new system design that exploits multi-hop transmission, multiple-antenna diversity, mmWave bandwidth, and dynamic path selection (PS) with traffic splitting techniques to overcome the severe path loss and mitigate the impact of blockages in mmWave networks.

The main contributions of our work are listed as follows:

– A joint PS and rate allocation (RA) optimization framework for multi-hop multi-path scheduling is formulated, whereby self-backhauled FD SCs act as relay nodes to forward data from the macro BS to the intended UEs. A multi-hop transmission technique enables reliable mmWave communications over a long distance. However, there is a probability that the mmWave signal is blocked by the human body. Hence, we also introduce a multi-path selection scheme in which the transmitter selects a subset of the best paths from the possible paths.

– In the proposed system design, leveraging a massive antenna array, hybrid beamforming is adopted to provide a Gbps data rate at mmWave bands. In addition, we impose a probabilistic latency bound to ensure URLLC with a high data rate. For this purpose, the studied problem is cast as a network utility maximization (NUM) problem, subject to a bounded latency constraint and network stability.

– Leveraging a stochastic optimization framework [77], we decouple the studied problem into two sub-problems, namely PS and RA. First, by utilizing historical information, reinforcement learning (RL) is used to build an empirical distribution of the system dynamics that aids in learning the best paths to solve PS [80, 94]. Therein, the concept of a regret strategy is employed, where regret is defined as the difference between the average utility obtained when choosing the same paths as in previous times and the average utility obtained by constantly selecting different paths [80, 94]. The premise is that regret is minimized over time so as to choose the best paths. Second, to solve the non-convex RA sub-problem, we apply the successive convex approximation (SCA) method due to its low complexity and fast convergence [85].


– The proposed approach addresses the following fundamental questions while ensuring a probabilistic latency constraint and network stability: (i) over which paths should the traffic flow be forwarded? and (ii) what is the data rate per flow/sub-flow? By using a mathematical analysis, the performance of our proposed stochastic optimization framework is comprehensively scrutinized. It is shown that there exists an $[O(1/\nu), O(\nu)]$ utility-queue backlog trade-off, which leads to utility-latency balancing [77], where $\nu$ is a control parameter. In addition, a convergence analysis of the two sub-problems is provided. Finally, the performance of the proposed solution is validated in an extensive set of simulations.

Related work

A tractable rate model was proposed to characterize the rate distribution in self-backhauled mmWave networks [95]. Few efforts have been made to study the mmWave network operation regime, noise-limited or interference-limited, depending on the density of interferers, transmission strategies, or channel propagation models [96, 97, 98]. A large body of research has studied joint RA, congestion control, routing, and scheduling for multi-hop wireless networks by incorporating the proportional latency based on the sum of queue backlogs [99], applying the concept of the back-pressure algorithm [100, 101], or exploiting the potential of multiple gateways [102].

The authors in [103] considered a problem of joint scheduling and congestion control in a multi-hop mmWave network using a NUM framework, in which the proposed solution is verified under three interference models, namely graph-based actual interference, interference-free (IF), and worst-case interference. The work in [103] also showed that the IF model provides a very tight upper bound for a realistic system evaluation in mmWave cellular networks as long as the optimal throughput can be guaranteed. However, [103] was concerned only with network capacity maximization and single-path streaming; tight latency and reliability constraints should be investigated together with dynamic path diversity.

Moreover, the authors in [104] designed a multi-hop wireless backhaul scheme with a latency guarantee, in which a link activation scheme was proposed to avoid interference and minimize the latency. A rate allocation problem to minimize the application-layer video/end-to-end distortion subject to quality of service constraints (latency, backhaul) was considered in [105, 106] for multi-path networks. However, other important aspects of 5G networks, such as low latency and high reliability, are generally ignored when maximizing the network performance (capacity, energy efficiency, and spectral efficiency) [39, 95, 107].

A recent work in [108] studied the multi-hop relaying transmission challenges for mmWave systems, aiming at maximizing the overall network throughput while taking into account traffic dynamics and link qualities. In our work, we also study the NUM optimization problem, while considering channel variations and network dynamics. Another recent work in [109] addressed the problem of traffic allocation for multi-hop scheduling in mmWave networks to minimize the end-to-end latency, in which the minimum latency is derived based on the channel capacity to determine the portions of traffic over channels such that all traffic fractions arrive simultaneously at the destination.

In addition, the problem of PS and multi-path congestion control for data transfers was studied in [110], in which the aggregate utility is increased as more paths are provided. One important suggestion is to re-select randomly from the set of paths and shift between paths with a higher payoff. However, splitting data over too many paths leads to increased signaling overhead and causes traffic congestion. While interesting, the preceding works do not address the problem of high data rate, low latency, and reliable communication in multi-path mmWave networks. In this respect, our proposed solution is to select the best paths to maximize the network throughput, subject to a latency bound violation constraint with a tolerable probability (reliability). Our previous work [72] studied URLLC-centric mmWave networks for single-hop transmission, and [23] proposed an integrated access and backhaul architecture for two-hop relaying without considering the latency-sensitive constraint. Hence, in this work we extend to the multi-hop wireless backhaul scenario and study a joint PS and RA problem focusing on URLLC. Via mathematical analyses and extensive simulations, we provide insights into the performance of our proposed algorithm and the convergence characteristics of the learning algorithm and the SOCP-based iterative method.

The rest of the chapter is organized as follows. Section 4.2 describes the system model and Section 4.3 provides the problem formulation for the joint PS and RA optimization. Section 4.4 introduces a stochastic optimization framework to decouple our studied problem, whereby two practical solutions are proposed. In Section 4.5, we provide extensive numerical results to compare against other baselines. Conclusions are drawn in Section 4.6.


[Figure 12 diagram: a macro BS, self-backhauled SCBSs with full-duplex communication, and UEs 1 to K; the traffic is split over Routes 1 to 4 and aggregated at the UEs; the one-hop transmission range is indicated.]

Fig. 12. Illustration of 5G multi-hop self-backhauled mmWave networks ([24] ©2019 IEEE).

4.2 System model

4.2.1 Network model

Let us consider a downlink (DL) transmission of a multi-hop heterogeneous cellular network (HCN), which consists of a macro base station (MBS), a set of $B$ self-backhauled small cell base stations (SCBSs), and a set $\mathcal{K}$ of $K$ user equipments (UEs), as shown in Fig. 12. Let $\mathcal{B} = \{0, 1, \cdots, B\}$ denote the set of all BSs, in which index 0 refers to the MBS. In-band wireless backhaul is used to provide backhaul among BSs [74]. A full-duplex (FD) transmission protocol is assumed at the SCBSs with perfect self-interference cancellation (SIC) capabilities [111]. Each BS $b$ is equipped with $N_b$ transmitting antennas and $R_b$ radio frequency (RF) chains, such that $1 \le R_b \le N_b$ [112, 113, 114]. Similarly, each UE $k$ is equipped with $N_k$ transmitting antennas and $R_k$ RF chains, such that $1 \le R_k \le N_k$, $R_k \le R_b$, and $N_k \ll N_b$. The network topology is modeled as a directed graph $\mathcal{G} = (\mathcal{N}, \mathcal{L})$, where $\mathcal{N} = \mathcal{B} \cup \mathcal{K}$ represents the set of nodes including BSs and UEs, and $\mathcal{L} = \{(i, j) \mid i \in \mathcal{B}, j \in \mathcal{N}\}$ denotes the set of all directional edges $(i, j)$, in which nodes $i$ and $j$ are the transmitter and the receiver, respectively.
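The directed graph described above can be represented with plain Python containers. The sketch below is illustrative only: the function name `build_topology` and the UE labels are hypothetical, and excluding self-loops $(i, i)$ is an assumption that the set definition does not state explicitly.

```python
def build_topology(num_scbs, num_ues):
    """Build node and edge sets for G = (N, L) with BS index 0 as the MBS."""
    base_stations = list(range(num_scbs + 1))            # set B = {0, 1, ..., B}
    ues = [f"ue{k}" for k in range(1, num_ues + 1)]       # set K (hypothetical labels)
    nodes = base_stations + ues                           # N = B ∪ K
    # Directional edges (i, j): the transmitter i is always a BS, the receiver j any node.
    # Self-loops are excluded here as an assumption.
    edges = {(i, j) for i in base_stations for j in nodes if i != j}
    return base_stations, ues, nodes, edges
```

With two SCBSs and three UEs, this gives six nodes and 15 candidate edges; a deployment would further prune edges that exceed the one-hop transmission range.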

We consider a queuing network operating in discrete time $t \in \mathbb{Z}^{+}$. There are $F$ independent data flows at the MBS. Each data traffic is destined for only one UE, whereas one UE can receive up to $R_k$ data streams, i.e., $F \ge K$. The total number of data streams at the MBS is no greater than the number of RF chains, such that $F \times R_k \le R_b$ [113, 114]. Hereafter, we refer to data traffic as data flow. We use $\mathcal{F}$ to represent the set of $F$ data flows/sub-flows. The MBS can split each flow $f \in \mathcal{F}$ into multiple sub-flows, which are delivered via disjoint paths and aggregated at the UEs [115, 116].

Table 2. Notations for system model ([75, 24] ©2019 IEEE).

Notation               Description
B, K                   Sets of (B + 1) base stations, K user equipments
N = B ∪ K              Set of nodes including BSs and UEs
L                      Set of all directional edges {(i, j) | i ∈ B, j ∈ N}
F                      Set of F flows
Z_f                    Set of Z_f disjoint paths observed by flow f
Z_f^m                  Disjoint path state/table m observed by flow f
N_i^(o)                Set of the next hops from node i
i_f^(I)                Previous hop of flow f to BS i
i_f^(o)                Next hop of flow f from BS i
p^f_(i,j)              Transmit power of node i to node j for flow f
z_f^m = 1              Path m is used to send data for flow f
π_f^m                  Probability of choosing path m for flow f

We assume that there exist $Z_f$ disjoint paths from the MBS to the UE for flow $f$. For any disjoint path $m \in \{1, \cdots, Z_f\}$, we denote $\mathcal{Z}_f^m$ as the path state, which contains all path information such as topology and queue states for every hop. Let $\mathcal{Z}_f = \{\mathcal{Z}_f^1, \cdots, \mathcal{Z}_f^m, \cdots, \mathcal{Z}_f^{Z_f}\}$ denote the path states/tables observed by flow $f$. We use the flow-split indicator vector $\mathbf{z}_f = (z_f^1, \cdots, z_f^{Z_f})$ to denote how the MBS splits flow $f$, where $z_f^m = 1$ means path $m$ is used to send data for flow $f$; otherwise, $z_f^m = 0$. Let $\mathcal{N}_i^{(o)}$ denote the set of next hops from node $i$ via a directional edge. We denote the next hop and the previous hop of flow $f$ from and to BS $i$ as $i_f^{(o)}$ and $i_f^{(I)}$, respectively. Table 2 shows the notations used throughout this chapter.

4.2.2 mmWave MIMO channel model

Due to limited spatial scattering in mmWave MIMO propagation [10, 114], we assume that there are $L_{(i,j)}$ clusters between transmitter $i$ and receiver $j$, such that $L_{(i,j)} \ll \min(N_i, N_j)$. The channel matrix $\mathbf{H}_{(i,j)}$ of link $(i, j)$ can be modelled as [114, 117, 118]

$$\mathbf{H}_{(i,j)} = \sqrt{\frac{N_i \times N_j}{L_{(i,j)}}} \sum_{l=1}^{L_{(i,j)}} h_{(i,j)}(l)\, \mathbf{A}_j(\alpha_{j,l})\, \mathbf{A}_i^{\dagger}(\alpha_{i,l}), \tag{62}$$

where $h_{(i,j)}(l)$ denotes the small-scale fading coefficient of the $l$th cluster, and $\alpha_{j,l}$ and $\alpha_{i,l}$ denote the azimuth angles of arrival and departure, respectively. Here, $\mathbf{A}_i(\alpha_{i,l})$ and $\mathbf{A}_j(\alpha_{j,l})$ represent the transmitter and receiver response vectors, respectively (please refer to [117, 118] for more details). We denote $\mathbf{H} = \{\mathbf{H}_{(i,j)} \mid (i, j) \in \mathcal{L}\}$ as the network channel matrix.

4.2.3 Transmission rate

We denote $p^f_{(i,j)}$ as the transmit power of node $i$ assigned to node $j$ for flow $f$, such that $\sum_{f \in \mathcal{F}} \sum_{j \in \mathcal{N}_i^{(o)}} p^f_{(i,j)} \le P_i^{\max}$, where $P_i^{\max}$ is the maximum transmit power of node $i$. We have the following power constraint:

$$\mathcal{P} = \Big\{ p^f_{(i,j)} \ge 0,\ i, j \in \mathcal{N} \ \Big|\ \sum_{f \in \mathcal{F}} \sum_{j \in \mathcal{N}_i^{(o)}} p^f_{(i,j)} \le P_i^{\max} \Big\}. \tag{63}$$

Vector $\mathbf{p} = (p^f_{(i,j)} \mid \forall i, j \in \mathcal{N}, \forall f \in \mathcal{F})$ denotes the transmit power over all flows.

Based on the hybrid beamforming and combining model [113, 114], with $\mathbf{c}_{(i,j)} \in \mathbb{C}^{N_j \times 1}$ as the RF combining and baseband equalizer and $\mathbf{v}_{(i,j)} \in \mathbb{C}^{N_i \times 1}$ as the hybrid analog/digital precoder, the Ergodic achievable rate⁵ $r_{(i,j)}$ at receiver $j$ from transmitter $i$ can be calculated as

$$r_{(i,j)} = \mathbb{E}_{\mathbf{H},\mathbf{p}}\left[ w \log\left( 1 + \frac{p_{(i,j)} \big|\mathbf{c}^{\dagger}_{(i,j)} \mathbf{H}^{T}_{(i,j)} \mathbf{v}_{(i,j)}\big|^2}{\sum_{i' \neq i} \sum_{j' \in \mathcal{N}^{(o)}_{i'}} p_{(i',j')} \big|\mathbf{c}^{\dagger}_{(i,j)} \mathbf{H}^{T}_{(i',j)} \mathbf{v}_{(i',j)}\big|^2 + \sigma_j^2 \|\mathbf{c}_{(i,j)}\|^2} \right) \right]. \tag{64}$$

Here, $p_{(i,j)}$ is the transmit power from transmitter $i$ assigned to receiver $j$, and the thermal noise of receiver $j$ is $\eta_j \sim \mathcal{CN}(0, \sigma_j^2)$ with a variance of $\sigma_j^2$. In addition, $w$ denotes the system bandwidth of the mmWave frequency band.

⁵ Note that we omit the beam search/tracking time, since it can be done fast and is negligible compared to the transmission time [119]. Due to the disjoint path assumption and directional beamforming, the interference associated with transmissions from transmitter $i$ to other receivers $j'$, received at $j$, is assumed to be negligible or can be mitigated by designing the two-layer precoder at transmitter $i$ [23, 25]. For the sake of simplicity, the impact of this interference is left for future work.
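To make the clustered channel model in (62) concrete, the following NumPy sketch draws one channel realization. It is a sketch under stated assumptions rather than the thesis' simulator: the array response is taken to be that of a half-wavelength uniform linear array, the cluster gains are i.i.d. $\mathcal{CN}(0, 1)$, and the angles are uniform on $[0, 2\pi)$. The function names `ula_response` and `clustered_channel` are hypothetical.

```python
import numpy as np

def ula_response(n_ant, angle):
    """Unit-norm response vector of a half-wavelength ULA (an assumed array geometry)."""
    k = np.arange(n_ant)
    return np.exp(1j * np.pi * k * np.sin(angle)) / np.sqrt(n_ant)

def clustered_channel(n_tx, n_rx, n_clusters, rng):
    """One realization of (62): scaled sum of rank-one cluster contributions."""
    # Assumed statistics: CN(0, 1) cluster gains, uniform angles of arrival/departure.
    gains = (rng.standard_normal(n_clusters)
             + 1j * rng.standard_normal(n_clusters)) / np.sqrt(2)
    aod = rng.uniform(0.0, 2.0 * np.pi, n_clusters)   # departure angles alpha_{i,l}
    aoa = rng.uniform(0.0, 2.0 * np.pi, n_clusters)   # arrival angles alpha_{j,l}
    H = np.zeros((n_rx, n_tx), dtype=complex)
    for l in range(n_clusters):
        a_rx = ula_response(n_rx, aoa[l])
        a_tx = ula_response(n_tx, aod[l])
        H += gains[l] * np.outer(a_rx, a_tx.conj())   # h(l) A_j(.) A_i^dagger(.)
    return np.sqrt(n_tx * n_rx / n_clusters) * H
```

Because each cluster contributes a rank-one term, the resulting matrix has rank at most $L_{(i,j)}$, which reflects the limited scattering assumption $L_{(i,j)} \ll \min(N_i, N_j)$.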

As studied in [120], previous works on mmWave hybrid beamforming mainly focus on the physical layer or signal processing aspects [112, 113, 114, 121]. The authors in [120] developed an accurate analytical model that captures the essence of mmWave hybrid beamforming while remaining tractable enough to analyze the throughput-delay performance. In our work, we adopt the model in [120] to formulate the network utility maximization subject to congestion control and network stability. In particular, let $g^{(t)}_{(i,j)}$ and $g^{(r)}_{(i,j)}$ denote the analog beamforming gains at transmitter $i$ and receiver $j$, respectively. In addition, we use $\omega^{(t)}_{(i,j)}$ and $\omega^{(r)}_{(i,j)}$ to represent the angles deviating from the strongest path between transmitter $i$ and receiver $j$. Also, let $\theta^{(t)}_{(i,j)}$ and $\theta^{(r)}_{(i,j)}$ denote the beamwidths at transmitter $i$ and receiver $j$, respectively. We adopt the widely used antenna radiation pattern [117, 120, 122] to determine the beamforming gain as

$$g_{(i,j)}\big(\omega_{(i,j)}, \theta_{(i,j)}\big) = \begin{cases} \dfrac{2\pi - (2\pi - \theta_{(i,j)})\Gamma}{\theta_{(i,j)}}, & \text{if } |\omega_{(i,j)}| \le \dfrac{\theta_{(i,j)}}{2}, \\[2mm] \Gamma, & \text{otherwise}, \end{cases}$$

where $0 < \Gamma \ll 1$ is the side lobe gain. After the beam alignment is done, the receiver sends the pilot sequences to the transmitter, which estimates the channel and precodes the signals. Throughout this work, the effective data rate $r_{(i,j)}$ of link $(i, j)$ is calculated as

$$r_{(i,j)} = \mathbb{E}_{\mathbf{H},\mathbf{p}}\left[ w \log\left( 1 + \frac{p_{(i,j)}\, g^{(t)}_{(i,j)}\, g^{(s)}_{(i,j)}\, g^{(r)}_{(i,j)}}{\sum_{i' \neq i} \sum_{j' \in \mathcal{N}^{(o)}_{i'}} p_{(i',j')}\, g^{(t)}_{(i',j)}\, g^{(s)}_{(i',j)}\, g^{(r)}_{(i',j)} + \sigma_j^2} \right) \right], \tag{65}$$

in which $g^{(s)}_{(i,j)}$ denotes the spatial channel gain of link $(i, j)$ [117, 118, 122].

For a given channel state and transmit power, the data rate on edge $(i, j)$ for flow $f$ can be posed as a function of the channel state and transmit power, i.e., $r^f_{(i,j)}(\mathbf{H}, \mathbf{p})$, such that $\sum_{f \in \mathcal{F}} r^f_{(i,j)} = r_{(i,j)}$. We denote $\mathbf{r} = (r^f_{(i,j)} \mid \forall i, j \in \mathcal{N}, \forall f \in \mathcal{F})$ as the vector of data rates over all flows.

Note that after the beam searching and alignment are done [117, 122, 123, 124], the receiver broadcasts pilot sequences to the transmitters; each transmitter estimates the channel to the corresponding receiver and precodes the transmit signal in the DL. With $N_j$ antennas and $R_j$ RF chains, each receiver is capable of receiving multiple data streams from different transmitters using either the main beam or the side lobe beam. We assume that the traffic split and aggregation are done ideally, so that the multiple data streams can be transmitted via different paths.
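The sectored antenna pattern used for the beamforming gain above can be written as a short function. This is a minimal sketch with a hypothetical function name `beam_gain`; angles are assumed to be in radians.

```python
import math

def beam_gain(omega, theta, side_lobe):
    """Sectored antenna gain: main lobe inside the beamwidth, side lobe elsewhere.

    omega: angular deviation from the strongest path (radians, assumed unit).
    theta: beamwidth (radians), side_lobe: side lobe gain Gamma with 0 < Gamma << 1.
    """
    if abs(omega) <= theta / 2.0:
        return (2.0 * math.pi - (2.0 * math.pi - theta) * side_lobe) / theta
    return side_lobe
```

One consequence of this pattern is that the gain integrated over all directions is constant: the main lobe gain times $\theta$ plus $\Gamma(2\pi - \theta)$ equals $2\pi$, so narrowing the beam raises the main lobe gain. For example, with a beamwidth of $\pi/6$ and $\Gamma = 0.1$, the main lobe gain is 10.9.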

4.2.4 Network queues

Let $Q_f^i(t)$ denote the queue length at BS $i$ at time slot $t$ for flow $f$. The queue length evolution at the MBS, $i = 0$, is

$$Q_f^i(t+1) = \Big[ Q_f^i(t) - \sum_{m=1,\, i_f^{(o)} \in \mathcal{Z}_f^m}^{Z_f} r^f_{(i,\, i_f^{(o)})}(t),\ 0 \Big]^{+} + a_f(t), \tag{66}$$

where $a_f(t)$ is the data arrival at the MBS during slot $t$, which is i.i.d. over time with a mean value $\bar{a}_f$ and is bounded by $a_f(t) \le a_f^{\max} < \infty$. Due to the disjoint paths, for each flow $f$ the incoming rate from the previous hop $i_f^{(I)}$ at SCBS $i$ is either from another SCBS or from the MBS, and thus the queue evolution at SCBS $i \in \{1, \cdots, B\}$ is given by

$$Q_f^i(t+1) = \Big[ Q_f^i(t) - r^f_{(i,\, i_f^{(o)})}(t),\ 0 \Big]^{+} + r^f_{(i_f^{(I)},\, i)}(t). \tag{67}$$
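The queue recursions (66) and (67) can be traced with a small simulation of one flow on a single two-hop path (MBS to SCBS to UE). The sketch below is illustrative: the function name `simulate_two_hop` and its inputs are hypothetical, and a single path per flow is assumed so that the sum in (66) has one term.

```python
def simulate_two_hop(arrivals, mbs_rates, scbs_rates):
    """Trace the MBS queue (66) and one SCBS queue (67) for a single flow and path.

    arrivals[t]:   data arriving at the MBS in slot t, a_f(t).
    mbs_rates[t]:  rate offered on the MBS-to-SCBS link in slot t.
    scbs_rates[t]: rate offered on the SCBS-to-UE link in slot t.
    Returns a list of (MBS queue, SCBS queue) pairs after each slot.
    """
    q_mbs = q_scbs = 0.0                      # queues are initially empty
    history = []
    for a, r_in, r_out in zip(arrivals, mbs_rates, scbs_rates):
        # (67): serve the SCBS queue, then add the rate offered by the previous hop.
        q_scbs = max(q_scbs - r_out, 0.0) + r_in
        # (66): serve the MBS queue, then add the new arrivals.
        q_mbs = max(q_mbs - r_in, 0.0) + a
        history.append((q_mbs, q_scbs))
    return history
```

As in (67), the SCBS queue is credited with the offered incoming rate rather than the amount actually drained upstream, so the trace is an upper bound on the true backlog whenever the MBS queue holds less than the offered rate.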

4.3 Problem formulation

Assume that the MBS determines over which paths to split data flow $f$ according to a given probability distribution, i.e., $\boldsymbol{\pi}_f = (\pi_f^1, \cdots, \pi_f^{Z_f})$, where for each $m \in \mathcal{Z}_f$ we have $\pi_f^m = \Pr(\mathbf{z}_f = z_f^m)$. Here, $\boldsymbol{\pi}_f$ is the probability mass function (PMF) of the flow-split vector, i.e., $\sum_{m=1}^{Z_f} \Pr(z_f^m) = 1$. We denote $\boldsymbol{\pi} = \{\boldsymbol{\pi}_1, \cdots, \boldsymbol{\pi}_f, \cdots, \boldsymbol{\pi}_F\} \in \Pi$ as the global probability distribution of all flow-split vectors, in which $\Pi$ is the set of all possible global PMFs. Let $\bar{x}_f$ denote the achievable average rate of flow $f$ such that

$$\bar{x}_f \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} x_f(\tau), \quad \text{and} \quad x_f(\tau) = \sum_{m=1,\, i_f^{(o)} \in \mathcal{Z}_f^m}^{Z_f} \mathbb{E}_{\mathbf{H},\mathbf{p}}\Big[ \pi_f^m\, r^f_{(i,\, i_f^{(o)})}(\tau) \Big] \Big|_{i=0}.$$

We assume that the achievable rate is bounded, i.e.,

$$0 \le x_f(t) \le a_f^{\max}, \tag{68}$$

where $a_f^{\max}$ is the maximum achievable rate of flow $f$ at every time $t$. Vector $\bar{\mathbf{x}} = (\bar{x}_1, \cdots, \bar{x}_F)$ denotes the time average of rates over all flows. Let $\mathcal{R}$ denote the rate region, which is defined as the convex hull of the average rates, i.e., $\bar{\mathbf{x}} \in \mathcal{R}$.

We define $U_0$ as the network utility function, i.e., $U_0(\bar{\mathbf{x}}) = \sum_{f \in \mathcal{F}} U(\bar{x}_f)$ [110, 23]. Here, $U(\cdot)$ is assumed to be a twice differentiable, concave, and increasing L-Lipschitz function for all $\bar{\mathbf{x}} \ge 0$. According to Little's law [125], the average queuing latency is defined as the ratio of the queue length to the average arrival rate. By taking into account the probabilistic latency constraints for each flow/sub-flow, the network utility maximization (NUM) is formulated as follows:

$$\textbf{OP2:} \quad \max_{\boldsymbol{\pi},\, \bar{\mathbf{x}},\, \mathbf{p}} \ U_0(\bar{\mathbf{x}}) \tag{69a}$$

$$\text{subject to} \quad \Pr\left( \frac{Q_f^i(t)}{\bar{a}_f} \ge d^{th} \right) \le \epsilon, \quad \forall t,\ f \in \mathcal{F},\ i \in \mathcal{B}, \tag{69b}$$

$$\lim_{t \to \infty} \frac{\mathbb{E}\big[|Q_f^i|\big]}{t} = 0, \quad \forall f \in \mathcal{F},\ \forall i \in \mathcal{B}, \tag{69c}$$

$$\mathbf{x}(t) \in \mathcal{R}, \tag{69d}$$

$$\boldsymbol{\pi} \in \Pi, \tag{69e}$$

$$\text{and (63), (68)},$$

where $d^{th}$ reflects the latency threshold required for UEs, and $\epsilon \ll 1$ is the target probability for reliable communication⁶. The probabilistic latency constraint (69b) implies that the probability that the latency for each flow at node $i$ is greater than $d^{th}$ is very small, which captures the constraints of ultra-low latency and reliable communication [72, 126]. It is also used to avoid congestion for each flow $f$ at any point (BS) in the network, since the queue length is ensured to be less than $d^{th}\bar{a}_f$ with probability $1 - \epsilon$. Hence, (69b) forces all BSs to transmit without building large queues, and (69c) maintains network stability.

The above problem has a non-linear probabilistic constraint (69b), which cannot be solved directly. Hence, we replace the non-linear constraint (69b) with a linear deterministic equivalent by applying Markov's inequality [127, 72], i.e., $\Pr(X \ge x) \le \mathbb{E}[X]/x$ for a non-negative random variable $X$ and $x > 0$. Thus, we relax (69b) as

$$\mathbb{E}\big[Q_f^i(t)\big] \le \bar{a}_f\, \epsilon\, d^{th}. \tag{70}$$
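The step from (69b) to (70) can be written out explicitly. The short derivation below is a sketch that uses only Markov's inequality as stated above, with $X = Q_f^i(t)$ and $x = \bar{a}_f d^{th}$; it also makes visible that (70) is a sufficient condition for (69b), so the deterministic constraint is, if anything, more conservative than the probabilistic one.

```latex
\begin{align*}
\Pr\!\left(\frac{Q_f^i(t)}{\bar{a}_f} \ge d^{th}\right)
  &= \Pr\!\left(Q_f^i(t) \ge \bar{a}_f\, d^{th}\right)
   \le \frac{\mathbb{E}\!\left[Q_f^i(t)\right]}{\bar{a}_f\, d^{th}}
   && \text{(Markov's inequality)} \\
\frac{\mathbb{E}\!\left[Q_f^i(t)\right]}{\bar{a}_f\, d^{th}} \le \epsilon
  &\iff \mathbb{E}\!\left[Q_f^i(t)\right] \le \bar{a}_f\, \epsilon\, d^{th}
   && \text{(which is (70), and implies (69b))}
\end{align*}
```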

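Assuming that $a_f(t)$ follows a Poisson arrival process [127], we derive the expected queue length in (66) for $i = 0$ as

$$\mathbb{E}\left[ Q_f^i(t) \right] = t\, a_f - \sum_{\tau=1}^{t} \sum_{m=1,\ i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m\, r^f_{(i,\, i_f^{(o)})}(\tau), \tag{71}$$

and the expected queue length in (67) for each SCBS as

$$\mathbb{E}\left[ Q_f^i(t) \right] = \sum_{\tau=1}^{t} \sum_{m} \pi_f^m \left( r^f_{(i_f^{(I)},\, i)}(\tau) - r^f_{(i,\, i_f^{(o)})}(\tau) \right). \tag{72}$$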

6 For the sake of simplicity, we assume that all UEs have the same latency and reliability requirements.


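Subsequently, combining the constraints (70) and (71), we obtain the following linear constraint (73) on the instantaneous rate, which helps to analyse and optimize the URLLC problem [72, 126], for the MBS $i = 0$,

$$a_f \left( t - \epsilon d^{\text{th}} \right) - \sum_{\tau=1}^{t-1} \sum_{m=1,\ i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m\, r^f_{(i,\, i_f^{(o)})}(\tau) \le \sum_{m=1,\ i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m\, r^f_{(i,\, i_f^{(o)})}(t). \tag{73}$$

Similarly, for each SCBS $i = 1, \cdots, B$, we have

$$-a_f\, \epsilon\, d^{\text{th}} + \sum_{\tau=1}^{t-1} \sum_{m} \pi_f^m \left( r^f_{(i_f^{(I)},\, i)}(\tau) - r^f_{(i,\, i_f^{(o)})}(\tau) \right) \le \sum_{m} \pi_f^m \left( r^f_{(i,\, i_f^{(o)})}(t) - r^f_{(i_f^{(I)},\, i)}(t) \right), \tag{74}$$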

by combining (70) and (72). With the aid of the above derivations, we consider (73) and (74) instead of (69b) in the original problem (69). In practice, the statistical information of all candidate paths needed to decide $\boldsymbol{\pi}_f$, $\forall f \in \mathcal{F}$, is not available beforehand, and thus solving (69) is challenging. One option is to assign paths randomly to each flow, which does not guarantee optimality, whereas an exhaustive search is not practical. Therefore, in this work, we apply Lyapunov stochastic optimization to the queuing network in order to characterize the queuing latency in the presence of randomness (mmWave wireless channels and arbitrary arrivals). As a result, (69) is decoupled into sub-problems, which can be solved by low-complexity and efficient methods. In particular, RL is leveraged to find the best paths without requiring statistical information, and the SCA method obtains a locally efficient solution for the flow rate allocation.
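To make the role of (73) concrete, the sketch below evaluates its left-hand side, i.e., the minimum aggregate rate the MBS must serve in slot $t$ given what was served in earlier slots. It is only an illustration: the function name is hypothetical, and all quantities are assumed to be expressed in slot-normalized units (rates per slot, $d^{\text{th}}$ in slots).

```python
def required_rate_mbs(a_f, t, eps, d_th, past_service):
    """Left-hand side of (73) for the MBS.

    past_service holds the aggregate served rates of slots 1..t-1
    (already summed over the selected paths). A non-positive result
    means (73) is satisfied by any non-negative rate in slot t.
    """
    if len(past_service) != t - 1:
        raise ValueError("need one served-rate entry per past slot")
    return a_f * (t - eps * d_th) - sum(past_service)

# Hypothetical example: arrivals of 4 units/slot, eps = 5 %, d_th = 10 slots.
needed = required_rate_mbs(a_f=4.0, t=3, eps=0.05, d_th=10.0, past_service=[4.0, 4.0])
```

In this example the MBS has kept up with the arrivals so far, so only the part of the new arrivals that exceeds the allowed backlog $a_f \epsilon d^{\text{th}}$ has to be served in slot 3.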

4.4 Proposed path selection and rate allocation algorithm

In this section, we propose a Lyapunov optimization based framework to solve our pre-

defined problem (69) with relaxed latency constraints. To do that, we first introduce a

set of auxiliary variables to refine the original problem (69). Next, we convert the con-

straints into virtual queues and derive the conditional Lyapunov drift function. Finally,

the solution of the equivalent problem is obtained by minimizing the Lyapunov drift

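and a penalty from the objective function. Let us start by rewriting (69) equivalently as follows:

\begin{align}
\textbf{RP2:}\quad \max_{\boldsymbol{\varphi},\,\boldsymbol{\pi},\,\mathbf{p}} \quad & U_0(\bar{\boldsymbol{\varphi}}) \tag{75a} \\
\text{subject to} \quad & \bar{\varphi}_f - \bar{x}_f \le 0, \quad \forall f \in \mathcal{F}, \tag{75b} \\
& \text{(63), (68), (69c), (69e), (73), (74),} \notag
\end{align}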


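where the new constraint (75b) is introduced to replace the rate constraint (69d) with new auxiliary variables $\boldsymbol{\varphi} = (\varphi_1, \cdots, \varphi_F)$. In (75b), $\bar{\boldsymbol{\varphi}} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\left[ |\boldsymbol{\varphi}(\tau)| \right]$. In order to ensure the inequality constraint (75b), we introduce a virtual queue $Y_f(t)$ for each flow, which evolves as

$$Y_f(t+1) = \left[ Y_f(t) + \varphi_f(t) - x_f(t) \right]^{+}, \quad \forall f \in \mathcal{F}. \tag{76}$$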
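The virtual queue in (76) simply accumulates the gap between the auxiliary variable and the served rate. A minimal sketch of this recursion, with hypothetical function names and made-up input sequences, is:

```python
def update_virtual_queue(y, phi, x):
    """One step of (76): Y(t+1) = max(Y(t) + phi(t) - x(t), 0)."""
    return max(y + phi - x, 0.0)

def run_virtual_queue(phis, xs, y0=0.0):
    """Trace of the virtual queue for given auxiliary variables and rates."""
    y = y0
    trace = [y]
    for phi, x in zip(phis, xs):
        y = update_virtual_queue(y, phi, x)
        trace.append(y)
    return trace

# The queue grows while phi exceeds x and drains once x catches up.
trace = run_virtual_queue([3.0, 3.0, 1.0, 1.0], [2.0, 2.0, 2.0, 4.0])
```

If the virtual queue stays stable, the time average of $\varphi_f$ cannot exceed that of $x_f$, which is exactly what (75b) requires.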

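Let $\boldsymbol{\Xi}(t) = (\mathbf{Q}(t), \mathbf{Y}(t))$ denote the queue backlogs. We first write the conditional Lyapunov drift for slot $t$ as

$$\Delta(\boldsymbol{\Xi}(t)) = \mathbb{E}\left[ L(\boldsymbol{\Xi}(t+1)) - L(\boldsymbol{\Xi}(t)) \mid \boldsymbol{\Xi}(t) \right], \tag{77}$$

where $L(\boldsymbol{\Xi}(t)) \triangleq \frac{1}{2} \left[ \sum_{f=1}^{F} \sum_{i=0}^{B} Q_f^i(t)^2 + \sum_{f=1}^{F} Y_f(t)^2 \right]$ is the quadratic Lyapunov function of $\boldsymbol{\Xi}(t)$ [77]. We apply the Lyapunov drift-plus-penalty technique [23, 77]: at each time slot $t$, the solution of (75) is obtained by minimizing the Lyapunov drift and a penalty from the objective function, i.e.,

$$\min\ \Delta(\boldsymbol{\Xi}(t)) - \nu\, \mathbb{E}\left[ U_0(\boldsymbol{\varphi}) \mid \boldsymbol{\Xi}(t) \right]. \tag{78}$$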

Here, $\nu$ is a control parameter that trades off utility optimality against queue length [23, 77]. Moreover, the stability of $\boldsymbol{\Xi}(t)$ ensures that the constraints (69c) and (75b) hold. Noting that $\max[a, 0]^2 \le a^2$ and $(a \pm b)^2 \le a^2 \pm 2ab + b^2$ for any non-negative real numbers $a, b$, and neglecting the indices $t, f, \ldots$, we have

\begin{align*}
\left( \max[Q - R^{(o)}, 0] + R^{(I)} \right)^2 - Q^2 &\le 2Q \left( R^{(I)} - R^{(o)} \right) + \left( R^{(I)} - R^{(o)} \right)^2, \\
\left( \max[Q - R^{(o)}, 0] + a \right)^2 - Q^2 &\le 2Q \left( a - R^{(o)} \right) + \left( a - R^{(o)} \right)^2, \\
\max[Y + \varphi - x, 0]^2 - Y^2 &\le 2Y (\varphi - x) + (\varphi - x)^2.
\end{align*}

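Subsequently, following the calculations of the Lyapunov optimization [77], for any $\boldsymbol{\varphi} \in \mathcal{R}$, any feasible $\boldsymbol{\pi}$, and all possible $\boldsymbol{\Xi}(t)$ for all $t$, we obtain

\begin{align}
(78) \le{} & \sum_{f=1}^{F} \sum_{i=1}^{B} Q_f^i\, \mathbb{E}\left[ \sum_{m} \pi_f^m \left( r^f_{(i_f^{(I)},\, i)} - r^f_{(i,\, i_f^{(o)})} \right) \Big|\, \boldsymbol{\Xi}(t) \right] \notag \\
& - \sum_{f=1}^{F} Q_f^{i|i=0}\, \mathbb{E}\left[ \sum_{m=1,\ i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m\, r^f_{(i,\, i_f^{(o)})} \Big|\, \boldsymbol{\Xi}(t) \right] \tag{79} \\
& + \sum_{f=1}^{F} \mathbb{E}\left[ Y_f \varphi_f - \nu U(\varphi_f) - Y_f x_f \mid \boldsymbol{\Xi}(t) \right] + \Psi. \notag
\end{align}

Here, $\Psi$ is a finite constant that satisfies

\begin{align*}
\Psi \ge{} & \frac{1}{2} \sum_{f=1}^{F} \sum_{i=1}^{B} \mathbb{E}\left[ \sum_{m} \pi_f^m \left( r^f_{(i_f^{(I)},\, i)} - r^f_{(i,\, i_f^{(o)})} \right)^2 \Big|\, \boldsymbol{\Xi}(t) \right] \\
& + \frac{1}{2} \sum_{f=1}^{F} \mathbb{E}\left[ \sum_{m=1,\ i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m \left( a_f - r^f_{(i,\, i_f^{(o)})} \right)^2 \Big|\, \boldsymbol{\Xi}(t) \right] \\
& + \frac{1}{2} \sum_{f=1}^{F} \mathbb{E}\left[ (\varphi_f - x_f)^2 \mid \boldsymbol{\Xi}(t) \right]
\end{align*}

[77, 23].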


The solution to (75) can be obtained by minimizing the upper bound in (79) without the finite constant $\Psi$. For every slot $t$, observing $\boldsymbol{\Xi}(t)$, we have three decoupled subproblems and provide the solution of each subproblem as follows. The flow-split vector and the probability distribution are determined by

$$\textbf{SP1:}\quad \min_{\boldsymbol{\pi}}\ \sum_{f=1}^{F} \aleph_f \quad \text{subject to (69e)},$$

where

$$\aleph_f = \sum_{i=1}^{B} Q_f^i \sum_{m} \pi_f^m \left( r^f_{(i_f^{(I)},\, i)} - r^f_{(i,\, i_f^{(o)})} \right) - Q_f^{i|i=0} \sum_{m=1,\ i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m\, r^f_{(i,\, i_f^{(o)})}.$$

Then, we select the optimal auxiliary variables by solving

$$\textbf{SP2:}\quad \min_{\boldsymbol{\varphi} | \boldsymbol{\pi}}\ \sum_{f=1}^{F} \left[ Y_f \varphi_f - \nu U(\varphi_f) \right] \quad \text{subject to } \varphi_f(t) \ge 0,\ \forall f \in \mathcal{F}.$$

Let $\varphi_f^*$ be the optimal solution obtained from the first-order derivative of the objective function of SP2. Assuming a logarithmic utility function, we have $\varphi_f^*(t) = \max\left\{ \frac{\nu}{Y_f},\, 0 \right\}$.
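The closed form of SP2 can be checked with a few lines of code. The sketch below assumes $U(\varphi) = \log(\varphi)$ and a strictly positive virtual queue; the function names and the numerical values of $\nu$ and $Y_f$ are hypothetical. It compares the stationary point $\nu / Y_f$ with a brute-force search over a grid.

```python
import math

def optimal_auxiliary(nu, y):
    """Stationary point of y*phi - nu*log(phi) over phi > 0, i.e. phi* = nu / y."""
    if y <= 0:
        raise ValueError("closed form needs a positive virtual queue")
    return nu / y

def sp2_objective(phi, nu, y):
    """Per-flow objective of SP2 with a logarithmic utility (assumption)."""
    return y * phi - nu * math.log(phi)

nu, y = 2.0, 4.0
phi_star = optimal_auxiliary(nu, y)

# Brute-force check on a grid with step 0.01 over (0, 5].
grid = [0.01 * k for k in range(1, 501)]
phi_grid = min(grid, key=lambda p: sp2_objective(p, nu, y))
```

Setting the derivative $Y_f - \nu / \varphi_f$ to zero gives the same value, so a larger control parameter $\nu$ or a smaller virtual queue pushes the auxiliary rate up.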

Finally, the RA is done by assigning transmit power, which is obtained by

$$\textbf{SP3:}\quad \min_{\mathbf{x},\, \mathbf{p} | \boldsymbol{\pi}}\ \sum_{f=1}^{F} -Y_f x_f \quad \text{subject to (63), (68), (73), (74).}$$

4.4.1 Path selection

Recall that $\mathbf{z}_f$ represents the flow-split vector given to flow $f$, and $z_f^m = 1$ when path $m$ is used to send data for flow $f$. The MBS selects paths for each flow with a given probability (mixed strategy) [73, 80]. We denote $u_f^m = u_f\left( z_f^m, \mathbf{z}_f^{-m} \right)$ as the utility function of flow $f$ when using path $m$. The vector $\mathbf{z}_f^{-m}$ denotes the flow-split vector excluding path $m$. Since the MBS can choose more than one path to deliver data, it follows from SP1 that the utility gain of flow $f$ is

$$u_f = \sum_{m} u_f^m = -\aleph_f.$$

To exploit the historical information, the MBS determines a flow-split vector for each flow $f$ from $\mathcal{Z}_f$ based on the PMF from the previous stage $t - 1$, i.e.,

$$\boldsymbol{\pi}_f(t-1) = \left( \pi_f^1(t-1), \cdots, \pi_f^{Z_f}(t-1) \right). \tag{80}$$


Here, we define $\boldsymbol{\Phi}_f(t) = (\Phi_f^1(t), \cdots, \Phi_f^m(t), \cdots, \Phi_f^{Z_f}(t))$ as the regret vector of determining the flow-split vector for flow $f$. The MBS selects the flow-split vector with the highest regret, in which the mixed-strategy probability is given as

$$\pi_f^m(t) = \frac{\left[ \Phi_f^m(t) \right]^{+}}{\sum_{m' \in \mathcal{Z}_f} \left[ \Phi_f^{m'}(t) \right]^{+}}. \tag{81}$$

Let $\hat{\boldsymbol{\Phi}}_f(t) = (\hat{\Phi}_f^1(t), \cdots, \hat{\Phi}_f^m(t), \cdots, \hat{\Phi}_f^{Z_f}(t))$ be the estimated regret vector of flow $f$.

Basically, with the goal of maximizing the cumulative reward in SP1, the MBS (agent) has to discover the possible paths (action set) in order to find the best paths (distribution of actions with higher pay-off) in the long run [80]. If the MBS spends too much time discovering paths (exploration), convergence takes longer. If the MBS only exploits the action that gave the highest pay-off at the beginning (exploitation), it may lose the chance to obtain a higher reward later. Hence, balancing the trade-off between exploration and exploitation is fundamental for efficient learning. For this purpose, we adopt the logit of Boltzmann-Gibbs (BG) kernel to efficiently learn the best paths [80, 94], $\boldsymbol{\beta}_f^m\left( \hat{\boldsymbol{\Phi}}_f(t) \right)$, given by

$$\boldsymbol{\beta}_f^m\left( \hat{\boldsymbol{\Phi}}_f(t) \right) = \arg\max_{\boldsymbol{\pi}_f \in \Pi} \sum_{m \in \mathcal{Z}_f} \left[ \pi_f^m(t)\, \hat{\Phi}_f^m(t) - \kappa_f\, \pi_f^m(t) \ln\left( \pi_f^m(t) \right) \right], \tag{82}$$

where the trade-off factor $\kappa_f$ balances exploration and exploitation [128, 94, 129]. If $\kappa_f$ is small, the MBS selects $\mathbf{z}_f$ with the highest payoff. For $\kappa_f \to \infty$, all decisions have equal probability.

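For a given set of $\hat{\boldsymbol{\Phi}}_f(t)$ and $\kappa_f$, we solve (82) to find the probability distribution, in which the solution determining the disjoint paths for each flow $f$ is given as

$$\beta_f^m\left( \hat{\boldsymbol{\Phi}}_f(t) \right) = \frac{\exp\left( \frac{1}{\kappa_f} \left[ \hat{\Phi}_f^m(t) \right]^{+} \right)}{\sum_{m' \in \mathcal{Z}_f} \exp\left( \frac{1}{\kappa_f} \left[ \hat{\Phi}_f^{m'}(t) \right]^{+} \right)}. \tag{83}$$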
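The distribution in (83) is a softmax over the positive parts of the estimated regrets. A minimal sketch is given below; the function name and the example regret values are hypothetical, and the subtraction of the largest term is only a standard numerical-stability device that leaves the probabilities unchanged.

```python
import math

def boltzmann_gibbs(regrets, kappa):
    """Path probabilities of (83) from estimated regrets and temperature kappa > 0."""
    positive = [max(r, 0.0) for r in regrets]            # the [.]^+ operator
    shift = max(positive)                                # numerical stability only
    weights = [math.exp((r - shift) / kappa) for r in positive]
    total = sum(weights)
    return [w / total for w in weights]

regrets = [1.0, 0.2, -0.5, 0.0]                          # hypothetical values
probs_cold = boltzmann_gibbs(regrets, kappa=0.1)         # small kappa: exploit
probs_hot = boltzmann_gibbs(regrets, kappa=1e6)          # large kappa: explore
```

With a small temperature almost all probability mass goes to the path with the highest regret, while a very large temperature yields a nearly uniform distribution, matching the two limiting cases described for $\kappa_f$.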

We denote $\hat{\mathbf{u}}_f(t)$ as the estimated utility of flow $f$ at time instant $t$ with action $\mathbf{z}_f$, i.e., $\hat{\mathbf{u}}_f(t) = (\hat{u}_f^1(t), \cdots, \hat{u}_f^m(t), \cdots, \hat{u}_f^{Z_f}(t))$. Upon receiving the feedback, $\tilde{u}_f(t)$ denotes the utility observed by flow $f$, i.e., $\tilde{u}_f(t) = u_f(t-1)$. We propose the learning mechanism at each time instant $t$ as follows.

Learning procedure: The estimates of the utility, regret, and probability distribution functions are updated for all actions per path $m$ as follows [73, 80]:

\begin{align}
\hat{u}_f^m(t) &= \hat{u}_f^m(t-1) + \iota_f^{(1)}(t)\, \mathbb{1}_{\{\mathbf{z}_f = \mathbf{z}_f^m\}} \left( \tilde{u}_f(t) - \hat{u}_f^m(t-1) \right), \notag \\
\hat{\Phi}_f^m(t) &= \hat{\Phi}_f^m(t-1) + \iota_f^{(2)}(t) \left( \hat{u}_f^m(t) - \tilde{u}_f(t) - \hat{\Phi}_f^m(t-1) \right), \tag{84} \\
\pi_f^m(t) &= \pi_f^m(t-1) + \iota_f^{(3)}(t) \left( \beta_f^m\left( \hat{\boldsymbol{\Phi}}_f(t) \right) - \pi_f^m(t-1) \right). \notag
\end{align}

Here, $\iota_f^{(1)}(t)$, $\iota_f^{(2)}(t)$, and $\iota_f^{(3)}(t)$ are the learning rates [73, 80, 129]. Based on the probability distribution as per (84), the MBS determines the flow-split vector for each flow $f$. Note that the learning-aided PS is performed over a long-term period to ensure that the paths do not change suddenly, so that the SCBSs have sufficient time to deliver traffic from the queues. For instance, the best paths are selected at the beginning of a large time scale and are used for the rest of that large time scale, as shown in Fig. 13.
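One iteration of the learning procedure in (84) for a single flow can be sketched as follows. This is an illustration under stated assumptions rather than the simulation code of this work: the function names are hypothetical, the learning-rate exponents 0.51, 0.55, and 0.6 and the temperature $\kappa_f = 5$ are taken from the simulation settings reported later in this chapter, and `chosen` stands for the index of the path used in the current stage.

```python
import math

def bg(regrets, kappa):
    """Boltzmann-Gibbs distribution of (83)."""
    pos = [max(r, 0.0) for r in regrets]
    top = max(pos)
    w = [math.exp((r - top) / kappa) for r in pos]
    s = sum(w)
    return [v / s for v in w]

def learning_step(u_hat, regret_hat, pi, chosen, observed, t, kappa=5.0):
    """One update of (84); returns new utility estimates, regrets, and probabilities."""
    r1, r2, r3 = (t + 1) ** -0.51, (t + 1) ** -0.55, (t + 1) ** -0.6
    # Utility estimate: only the path that was actually used is updated.
    u_new = [u + r1 * (observed - u) if m == chosen else u
             for m, u in enumerate(u_hat)]
    # Regret estimate: estimated utility of path m minus the observed utility.
    reg_new = [g + r2 * (u - observed - g) for g, u in zip(regret_hat, u_new)]
    # Probabilities move towards the Boltzmann-Gibbs target.
    target = bg(reg_new, kappa)
    pi_new = [p + r3 * (b - p) for p, b in zip(pi, target)]
    return u_new, reg_new, pi_new

u_new, reg_new, pi_new = learning_step(
    u_hat=[0.0, 0.0], regret_hat=[0.0, 0.0], pi=[0.5, 0.5],
    chosen=0, observed=1.0, t=0,
)
```

Because each probability update is a convex combination of the previous distribution and the Boltzmann-Gibbs target, the result remains a valid probability distribution at every step.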

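Here, we briefly establish the convergence conditions to the $o$-coarse correlated equilibrium for the reinforcement learning based algorithm, where $o$ is a very small positive value [130]. The complete proof is given in [94, 129]. The learning rates $\iota_f^{(1)}(t)$, $\iota_f^{(2)}(t)$, and $\iota_f^{(3)}(t)$ are chosen to satisfy the following convergence conditions:

\begin{align*}
& \lim_{t \to \infty} \sum_{\tau=0}^{t} \iota_f^{(1)}(\tau) = +\infty, \quad \lim_{t \to \infty} \sum_{\tau=0}^{t} \iota_f^{(2)}(\tau) = +\infty, \quad \lim_{t \to \infty} \sum_{\tau=0}^{t} \iota_f^{(3)}(\tau) = +\infty, \\
& \lim_{t \to \infty} \sum_{\tau=0}^{t} \iota_f^{(1)}(\tau)^2 < +\infty, \quad \lim_{t \to \infty} \sum_{\tau=0}^{t} \iota_f^{(2)}(\tau)^2 < +\infty, \quad \lim_{t \to \infty} \sum_{\tau=0}^{t} \iota_f^{(3)}(\tau)^2 < +\infty, \\
& \lim_{t \to \infty} \frac{\iota_f^{(3)}(t)}{\iota_f^{(2)}(t)} = 0, \quad \lim_{t \to \infty} \frac{\iota_f^{(2)}(t)}{\iota_f^{(1)}(t)} = 0.
\end{align*}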

4.4.2 Rate allocation

Consider $r^f_{(i,j)} = \log\left( 1 + p^f_{(i,j)} |g_{(i,j)}(\mathbf{h})|^2 \right)$ as the transmission rate, where the effective channel gain$^7$ for mmWave channels can be modeled as $|g_{(i,j)}(\mathbf{h})|^2 = \frac{|\tilde{g}_{(i,j)}(\mathbf{h})|^2}{1 + I_{\max}}$ [23, 69]. Here, $\tilde{g}_{(i,j)}(\mathbf{h})$ and $I_{\max}$ denote the normalized channel gain and the maximum interference, respectively. Denoting the left hand side (LHS) of (73) and (74) as $D_i^f$ for simplicity, the optimal values of flow control $\mathbf{x}$ and transmit power $\mathbf{p}$ in sub-problem 3 (SP3) are found by solving

\begin{align}
\min_{\mathbf{x},\, \mathbf{p} | \boldsymbol{\pi}} \quad & \sum_{f=1}^{F} -Y_f x_f \tag{85a} \\
\text{subject to} \quad & 1 + p^f_{(i,\, i_f^{(o)})} |g_{(i,\, i_f^{(o)})}|^2 \ge e^{x_f}, \quad \forall f \in \mathcal{F},\ i = 0, \tag{85b} \\
& \frac{1 + p^f_{(i,\, i_f^{(o)})} |g_{(i,\, i_f^{(o)})}|^2}{1 + p^f_{(i_f^{(I)},\, i)} |g_{(i_f^{(I)},\, i)}|^2} \ge e^{D_i^f}, \quad f \in \mathcal{F},\ \forall i = 1 : B, \tag{85c} \\
& \sum_{f \in \mathcal{F}} p^f_{(i,\, i_f^{(o)})} \le P_i^{\max}, \quad \forall i \in \mathcal{B},\ \forall f \in \mathcal{F}. \tag{85d}
\end{align}

7 The effective channel gain captures the path loss, channel variations, and interference penalty. (Here, the impact of interference is considered small due to highly directional beamforming and high path loss for interfered signals in the mmWave frequency band, and thus multi-hop directional transmission can be operated in dense mmWave networks [95, 96, 97, 98, 103].)

Constraint (85c) is non-convex. Motivated by the low complexity of the SCA method, we solve (85) by replacing (85c) with a proper convex approximation; however, such an approximation is hard to find for (85c) directly [85, 131]. In this regard, we introduce the slack variable $y$ to transform (85c) into equivalent constraints that admit a proper bound satisfying the conditions in [85, Property A]:

$$\frac{2 + p^f_{(i,\, i_f^{(o)})} |g_{(i,\, i_f^{(o)})}|^2}{2} \ge \sqrt{ y^2 + \left( \frac{p^f_{(i,\, i_f^{(o)})} |g_{(i,\, i_f^{(o)})}|^2}{2} \right)^2 }, \tag{86}$$

$$\frac{y^2}{1 + p^f_{(i_f^{(I)},\, i)} |g_{(i_f^{(I)},\, i)}|^2} \ge e^{D_i^f}. \tag{87}$$

Here, constraint (86) has the form of a second-order cone inequality [131, 85, 132] (squaring both sides shows that it is equivalent to $1 + p^f_{(i,\, i_f^{(o)})} |g_{(i,\, i_f^{(o)})}|^2 \ge y^2$), while the LHS of constraint (87) is a quadratic-over-affine function, which is iteratively replaced by its first-order approximation to obtain a convex approximation as follows [86, 133]:

$$\frac{2 y\, y^{(l)}}{1 + p^{f(l)}_{(i_f^{(I)},\, i)} |g_{(i_f^{(I)},\, i)}|^2} - \frac{\left( y^{(l)} \right)^2 \left( 1 + p^f_{(i_f^{(I)},\, i)} |g_{(i_f^{(I)},\, i)}|^2 \right)}{\left( 1 + p^{f(l)}_{(i_f^{(I)},\, i)} |g_{(i_f^{(I)},\, i)}|^2 \right)^2} \ge e^{D_i^f}. \tag{88}$$

Here, the superscript $l$ denotes the $l$th iteration. Hence, we iteratively solve the approximated convex problem of (85) as in Algorithm 4.1, in which the approximated problem is given as

$$\min_{\mathbf{x},\, \mathbf{p} | \boldsymbol{\pi}}\ \sum_{f=1}^{F} -Y_f x_f \quad \text{subject to (85d), (68), (85b), (86), (88).} \tag{89}$$
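The reason the linearization in (88) is safe can be checked numerically: the first-order expansion of the quadratic-over-affine function $y^2 / z$, with $z$ standing for $1 + p |g|^2$, never exceeds the function itself and touches it at the expansion point. The sketch below verifies this on a grid; the function names and the chosen expansion point are hypothetical.

```python
def quad_over_affine(y, z):
    """The function y^2 / z on the left-hand side of (87), with z > 0."""
    return y * y / z

def first_order_bound(y, z, y_l, z_l):
    """Linearization of y^2 / z around (y_l, z_l), the form used in (88)."""
    return 2.0 * y * y_l / z_l - (y_l ** 2) * z / (z_l ** 2)

y_l, z_l = 1.5, 2.0                                   # hypothetical expansion point
points = [(0.5 + 0.25 * a, 1.0 + 0.5 * b) for a in range(9) for b in range(9)]
gaps = [quad_over_affine(y, z) - first_order_bound(y, z, y_l, z_l)
        for y, z in points]
tight = quad_over_affine(y_l, z_l) - first_order_bound(y_l, z_l, y_l, z_l)
```

Since the gap equals $(y - y^{(l)} z / z^{(l)})^2 / z \ge 0$, any point that satisfies (88) also satisfies (87), so every iterate of the approximated problem remains feasible for the original constraint.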


[Figure 13: block diagram. Labels: π(t−1), Q(t−1), Y(t−1); u(t), Φ(t), π(t); z(t). Learning in long-term period: regret learning based path selection (SP1), path distribution estimation. Rate allocation in short-term period: auxiliary variable selection (SP2), iterative rate allocation (SP3), DL transmission, queue update.]

Fig. 13. Information flow diagram of the learning-aided PS and RA approach ([75, 24] © 2019 IEEE).

Algorithm 4.1 Iterative RA

1: Initialization: set $l = 0$ and generate initial points $y^{(l)}$.
2: repeat
3:     Solve (89) with $y^{(l)}$ to get the optimal value $y^{(l)\star}$.
4:     Update $y^{(l+1)} := y^{(l)\star}$; $l := l + 1$.
5: until convergence


Finally, the information flow diagram of the learning-aided PS and RA approach is shown in Fig. 13, where the RA is executed in a short-term period. Note that the PS and RA are both done at the MBS; in this work, we assume that the information is shared among the base stations via the X2 interface. As opposed to a brute-force approach yielding the globally optimal solution, the proposed iterative solution, which uses time scale separation, considerably reduces the search time and computational complexity while obtaining an efficient suboptimal solution$^8$.

8Note that the problem of finding the global optimality is outside the scope of our study. The effectiveness of

SOCP method was verified in the literature and shown to be robust in practical scenarios [131].



In this section, Monte Carlo simulations are carried out to evaluate the system performance of our proposed algorithm. To solve Algorithm 4.1, we use the YALMIP toolbox to model the optimization problem with MOSEK as the internal solver [88]. For simulations, we assume that there are two flows from the MBS to two UEs, while the number of available paths for each flow is four [110]. The MBS selects two paths from the four most popular paths$^9$. Each path contains two relays, the total number of SCBSs is 8, and the one-hop distance varies from 50 to 100 meters. The maximum transmit powers of the MBS and each SC are 43 dBm and 30 dBm, respectively, and the SC antenna gain is 5 dBi. The number of antennas $N_b$ at each BS is set to 8 and 64 for small and large antenna arrays, respectively. The number of antennas $N_k$ at each UE is set to 2 and 16 for small and large antenna arrays, respectively. The numbers of RF chains at the BS, $R_b$, and at the UE, $R_k$, are set to 8 and 2, respectively.

For simulation purposes, the general channel model for arbitrary antenna arrays is used. In particular, the estimate $\hat{\mathbf{H}}_{(i,j)} \in \mathbb{C}^{N_i \times N_j}$ of the channel matrix $\mathbf{H}_{(i,j)} \in \mathbb{C}^{N_i \times N_j}$ between the transmitter $i$ and the receiver $j$ can be modeled as [25, 134]

$$\hat{\mathbf{H}}_{(i,j)} = \sqrt{N_i \times N_j}\; \boldsymbol{\Theta}_{(i,j)}^{1/2} \left( \sqrt{1 - \tau_j^2}\, \mathbf{W}_{(i,j)} + \tau_j \hat{\mathbf{W}}_{(i,j)} \right),$$

where $\mathbf{W}_{(i,j)} = \left[ \mathbf{w}^{1}_{(i,j)}, \cdots, \mathbf{w}^{n_j}_{(i,j)}, \cdots, \mathbf{w}^{N_j}_{(i,j)} \right] \in \mathbb{C}^{N_i \times N_j}$ is the small-scale fading channel matrix, whose entries are independent and identically distributed (i.i.d.) with zero mean and variance $\frac{1}{N_i \times N_j}$, and $\mathbf{w}^{n_j}_{(i,j)} \in \mathbb{C}^{N_i \times 1}$ is the small-scale fading channel vector between the transmitter antenna array and the $n_j$th antenna of receiver $j$. Here, $\tau_j \in [0, 1]$ reflects the estimation accuracy for receiver $j$; if $\tau_j = 0$, then $\hat{\mathbf{H}}_{(i,j)} = \mathbf{H}_{(i,j)}$, i.e., perfect channel state information is assumed at the transmitters [135]. $\hat{\mathbf{W}}_{(i,j)} \in \mathbb{C}^{N_i \times N_j}$ is the estimation noise, also modeled as a realization of a circularly symmetric complex Gaussian matrix with zero mean and variance $\frac{1}{N_i \times N_j}$ [23, 25]. Moreover, $\boldsymbol{\Theta}_{(i,j)} \in \mathbb{C}^{N_i \times N_i}$ is the antenna spatial correlation matrix that accounts for the path loss and shadow fading, such that $\mathrm{Rank}(\boldsymbol{\Theta}_{(i,j)}) \ll N_i$.

We generate the spatial correlation matrix as $\boldsymbol{\Theta}_{(i,j)} = PL_{(i,j)} \tilde{\boldsymbol{\Theta}}_{(i,j)}$ with $\mathrm{Rank}(\boldsymbol{\Theta}_{(i,j)}) = R_i$, and the normalized spatial correlation matrix $\tilde{\boldsymbol{\Theta}}_{(i,j)}$ with $\mathrm{Tr}(\tilde{\boldsymbol{\Theta}}_{(i,j)}) = N_i$ [134]. The mmWave path loss $PL_{(i,j)}$ is modeled as a distance-based path loss for urban environments at 28 GHz with a 1 GHz system bandwidth [136, 49], which may exist in a line-of-sight (LOS), non-LOS (NLOS), or blockage state. We adopt the mmWave channel model used in the system-level simulation in [136], given by

$$PL(d) = \Pr(d)\, PL_{\text{LOS}}(d) + (1 - \Pr(d))\, PL_{\text{NLOS}}(d),$$

where $PL_{\text{LOS}}(d)$ and $PL_{\text{NLOS}}(d)$ are the distance-based path losses for the LOS and NLOS states at distance $d$, respectively [136]. Here, $\Pr(d)$ denotes a Boolean random variable that is 1 with some probability. For the general blockage channel model, the LOS probability is defined as $\exp(-0.006d)$, and the NLOS probability is then $1 - \exp(-0.006d)$ [136, 49]. For the analog beamforming, the side lobe gain $\Gamma$ is set to 14, and the beamwidths at the transmitter and receiver are set to $\frac{\pi}{4}$ and $\frac{\pi}{3}$ radians, respectively.

9 As studied in [110], it suffices for a flow to maintain at least two paths provided that it repeatedly selects new paths at random and replaces an existing path if the new one provides higher throughput.
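The LOS/NLOS path loss model above can be sampled with a short routine. The sketch below is an illustration only: it uses the LOS and NLOS expressions listed in Table 3, assumes that $\log$ denotes the base-10 logarithm and that $d$ is given in metres (neither is stated explicitly here), and uses hypothetical function names.

```python
import math
import random

def los_probability(d):
    """LOS probability exp(-0.006 d) of the general blockage model."""
    return math.exp(-0.006 * d)

def path_loss_db(d, rng):
    """Sample the path loss in dB at distance d for a random LOS/NLOS state."""
    if rng.random() < los_probability(d):
        return 61.4 + 20.0 * math.log10(d)      # LOS model at 28 GHz (Table 3)
    return 72.0 + 29.2 * math.log10(d)          # NLOS model at 28 GHz (Table 3)

rng = random.Random(1)
samples = [path_loss_db(100.0, rng) for _ in range(10_000)]
# At 100 m the LOS loss is 101.4 dB and the NLOS loss is 130.4 dB.
los_share = sum(1 for s in samples if s < 110.0) / len(samples)
```

At a one-hop distance of 100 m, roughly 55% of the sampled links are in the LOS state, which illustrates why short multi-hop links are attractive compared with a single long hop.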

We assume that the traffic flow is divided equally into two sub-flows, and the arrival rate of each sub-flow varies from 2 to 5 Gbps for the small antenna array case. The maximum delay requirement $\beta$ and the target reliability probability $\epsilon$ are set to 10 ms and 5%, respectively [72]. For the learning algorithm, the Boltzmann temperature (trade-off factor) $\kappa_f$ is set to 5, while the learning rates $\iota^{(1)}(t)$, $\iota^{(2)}(t)$, and $\iota^{(3)}(t)$ are set to $\frac{1}{(t+1)^{0.51}}$, $\frac{1}{(t+1)^{0.55}}$, and $\frac{1}{(t+1)^{0.6}}$, respectively [129, 73]. The parameter settings are summarized in Table 3.
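The chosen learning rates can be checked against the convergence conditions stated for the learning procedure. For rates of the form $(t+1)^{-e}$, standard p-series arguments give that the sum diverges when $e \le 1$, the sum of squares converges when $2e > 1$, and the ratio of two rates vanishes when the numerator has the larger exponent. The following sketch encodes these checks; the function names are hypothetical.

```python
def learning_rate(t, exponent):
    """Polynomially decaying learning rate 1 / (t + 1)^exponent."""
    return 1.0 / (t + 1) ** exponent

EXPONENTS = (0.51, 0.55, 0.6)      # for iota^(1), iota^(2), iota^(3)

def satisfies_conditions(exponents):
    """Check the convergence conditions for rates of the form (t + 1)^-e."""
    e1, e2, e3 = exponents
    divergent_sum = all(e <= 1.0 for e in exponents)            # sum diverges
    finite_square_sum = all(2.0 * e > 1.0 for e in exponents)   # squared sum converges
    ordered = e1 < e2 < e3                                      # ratios tend to zero
    return divergent_sum and finite_square_sum and ordered

ok = satisfies_conditions(EXPONENTS)
# The ratio iota^(3) / iota^(2) = (t + 1)^-0.05 decays, although slowly.
ratios = [learning_rate(t, 0.6) / learning_rate(t, 0.55) for t in (10, 10**3, 10**6)]
```

The ordering of the exponents means that the utility estimate is updated on the fastest time scale, the regret on an intermediate one, and the path probabilities on the slowest one.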

Our work combines three main features: (i) NUM [77, 101], (ii) dynamic path selection learning [94], and (iii) URLLC-aware rate allocation [72]. We consider the following baselines: Baseline 1 employs features (i) and (ii), Baseline 2 applies features (i) and (iii), and Baseline 3 considers only feature (i). We benchmark our work against these baselines to assess the impact of the dynamic path selection and of the URLLC-constrained rate allocation, which have not been addressed in the literature in the context of mmWave communications. In addition, the Single hop scheme considers that the MBS delivers data to the UEs over one single hop at long distance, in which the probability of LOS communication is low, and thus blockage needs to be taken into account [136].

4.5.1 Small antenna array system

We first evaluate the network performance under the small antenna array setting, i.e., $N_i = 8$, $N_j = 2$. In Fig. 14, we report the average one-hop latency$^{10}$ versus the mean arrival rate $\mu$. As we increase $\mu$, baselines 3, 2, and 1 violate the latency constraint at $\mu$ = 3.5, 4.5, and 5 Gbps, respectively, whereas the average latency of our proposed algorithm increases gradually with $\mu$ but stays below the threshold $\beta$ = 10 ms. The reason is that our proposed algorithm satisfies the latency requirement via the equivalent instantaneous rate constraints (73) and (74), while baselines 1 and 3 use the traditional utility-latency trade-off approach without considering the latency constraint, and baseline 2 considers the random PS mechanism only. The benefit of the learning-based path selection is that selecting paths with high payoff and less congestion results in low latency. At $\mu$ = 4.5 Gbps, baseline 1 with learning achieves a lower average one-hop latency than baselines 2 and 3, whereas our proposed scheme reduces latency by 50.64%, 81.32%, and 92.9% as compared to baselines 1, 2, and 3, respectively. When $\mu$ = 5 Gbps, the average latency of all baselines increases dramatically, violating the latency requirement of 10 ms, while our proposed scheme still meets the latency requirement.

10 The average end-to-end latency is defined as the sum of the average one-hop latency of all hops.

Table 3. Parameter settings ([24] © 2019 IEEE)

Parameter                        Value
B, K, F                          8, 2, 4
Number of BS antennas N_b        8, 64
Number of UE antennas N_k        2, 16
Maximum latency β                10 ms
Target reliability ε             0.05, 0.1, 0.15
Boltzmann temperature            2, 5, 10, 20, 50
Path loss model
  LOS @ 28 GHz                   61.4 + 20 log(d) dB
  NLOS @ 28 GHz                  72 + 29.2 log(d) dB
System bandwidth                 1 GHz

In Fig. 15, we report the tail distribution (complementary cumulative distribution function (CCDF)) of latency to showcase how often the system experiences a latency greater than the target latency levels [137] for $\mu$ = 4.5 Gbps, $\epsilon$ = 5%, $\beta$ = 10 ms. In contrast to the average latency, the tail distribution is an important metric to reflect the URLLC characteristic. For instance, at $\mu$ = 4.5 Gbps, by imposing the probabilistic latency constraint, our proposed approach ensures reliable communication with a better guaranteed probability, i.e., Pr(latency > 10 ms) < $10^{-6}$. In contrast, baseline 1 with learning violates the latency constraint with high probability, where Pr(latency > 10 ms) = 0.08 and Pr(latency > 25 ms) < $10^{-6}$, while the performance of baselines 2 and 3 is worse. For instance, as shown in Fig. 15, baselines 2 and 3 obtain Pr(latency > 10 ms) > 0.12 and Pr(latency > 10 ms) > 0.24, respectively. For throughput comparison, we observe that for $\mu$ = 4.5 Gbps, our proposed algorithm is able to deliver 4.4874 Gbps of average network throughput per sub-flow, while baselines 1, 2, and 3 deliver 4.4759, 4.4682, and 4.3866 Gbps, respectively. Here, the Single hop scheme only delivers 3.55 Gbps due to the high path loss, causing large latency.

[Figure 14: line plot. x-axis: Mean arrival rate [Gbps] (2 to 5); y-axis: Average one-hop latency [ms] (0 to 30); curves: Proposed Algorithm, Baseline 1, Baseline 2, Baseline 3; annotation: β = 10 ms, ε = 0.05.]

Fig. 14. Average one-hop latency versus mean arrival rates ([75, 24] © 2019 IEEE).

Note that in this work we mainly focus on the low latency scale, i.e., 1 to 10 ms, where the achievable rates of all schemes are very high and close to each other. Hence, we report the average MBS queue length instead of the average achievable rate. Generally speaking, as per (66), the average achievable rate can be extracted from the average MBS queue length and the mean arrival rate, i.e., $\bar{x}_f = \mu_f - \bar{Q}_f$. In Fig. 16, we plot the average queue length of the MBS as a function of the mean arrival rate. As we increase the mean arrival rate from 2 to 5 Gbps, the average MBS queue length of our proposed algorithm increases from 0.01 Gb to 0.04 Gb, which means that the average latency at the MBS increases from 5 ms to 8 ms and thus meets the latency constraint (69b). In contrast, the average latency of the baselines increases up to 16 ms, which violates the latency constraint (69b).
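The conversion from queue length to latency used above follows directly from Little's law. A minimal sketch with a hypothetical function name reproduces the two latency values quoted for the proposed algorithm:

```python
def queuing_latency_ms(queue_gb, arrival_gbps):
    """Average latency in milliseconds: queue length divided by mean arrival rate."""
    return 1e3 * queue_gb / arrival_gbps

low = queuing_latency_ms(0.01, 2.0)     # proposed algorithm at 2 Gbps
high = queuing_latency_ms(0.04, 5.0)    # proposed algorithm at 5 Gbps
```

Both values stay below the 10 ms threshold, consistent with the discussion of Fig. 16.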

[Figure 15: CCDF plot. x-axis: One-hop latency [ms] (0 to 40); y-axis: CCDF; curves: Proposed Algorithm, Baseline 1, Baseline 2, Baseline 3; annotations: BL1: Pr(delay > 10) > 0.08, BL2: Pr(delay > 10) > 0.12, BL3: Pr(delay > 10) > 0.24, Proposed: Pr(delay > 10) < 10^-6.]

Fig. 15. CCDF of one-hop latency, small antenna array ([75, 24] © 2019 IEEE).

[Figure 16: line plot. x-axis: Mean arrival rate [Gbps] (2 to 5); y-axis: Average MBS queue length [Gb] (0 to 0.08); curves: Proposed Algorithm, Baseline 1, Baseline 2, Baseline 3.]

Fig. 16. Average MBS queue length versus mean arrival rate ([24] © 2019 IEEE).

4.5.2 Large antenna array system

In order to achieve a higher beamforming gain, large antenna arrays are employed at both the transmitter and the receiver, i.e., $N_i = 64$, $N_j = 16$. In this setting, the maximum transmit power at the MBS is reduced to 41 dBm and the transmitter beamwidth is reduced to 0.5 radian. Our proposed algorithm is evaluated under both LOS and blockage channel states, whereas all baselines use the LOS communication model [136, 49, 138, 139]. First, in Fig. 17 we plot the CCDF of the one-hop latency (in logarithmic scale) of all schemes when the mean arrival rate is 4.5 Gbps, which is the same rate as used in Fig. 15. Interestingly, due to the higher antenna gains, no scheme violates the latency constraint with an upper bound of 10 ms and a target probability of 5%, as illustrated in Fig. 17. However, baseline 3 does not employ the two important features (ii) dynamic path selection learning and (iii) URLLC-aware rate allocation, and thus has a longer tail of the latency distribution.

[Figure 17: CCDF plot (logarithmic y-axis). x-axis: Latency [ms] (1 to 40); y-axis: CCDF (10^-5 to 10^0); curves: Proposed Algorithm - LOS, Proposed Algorithm - Blockage, Baseline 1 - LOS, Baseline 2 - LOS, Baseline 3 - LOS; annotation: all schemes satisfy the latency constraint due to higher antenna gain when the mean arrival rate is small.]

Fig. 17. CCDF of one-hop latency, large antenna array, µ = 4.5 Gbps ([24] © 2019 IEEE).

Next, we increase the mean arrival rate to showcase the trade-off between latency and network arrival rate. Fig. 18 reports the CCDF of the one-hop latency of all schemes with the increased mean arrival rate, i.e., $\mu$ = 9.5 Gbps. It can be observed that the performance of our proposed algorithm is degraded under the impact of blockage channels, in which the latency distribution has a longer tail than that of baseline 1. With the increased mean arrival rate, baselines 2 and 3 violate the latency constraint with high probabilities, such that Pr(latency > 10 ms) > 10% for baseline 2 and Pr(latency > 10 ms) > 20% for baseline 3. The latency of all schemes increases as we increase the network arrival rate, which showcases the trade-off between the latency and the network arrival rate.

[Figure 18: CCDF plot (logarithmic y-axis). x-axis: Latency [ms] (0 to 55); y-axis: CCDF (10^-5 to 1); curves: Proposed Algorithm - LOS, Proposed Algorithm - Blockage, Baseline 1 - LOS, Baseline 2 - LOS, Baseline 3 - LOS; annotation: not all schemes meet the latency constraint when the mean arrival rate is higher.]

Fig. 18. CCDF of one-hop latency, large antenna array, µ = 9.5 Gbps ([24] © 2019 IEEE).

4.5.3 Convergence characteristics

We plot the convergence of the iterative algorithm for different numbers of hops in Fig. 19. Here, we provide the distribution of the number of iterations of the SOCP-based algorithm, where the algorithm stops once an accuracy of $10^{-2}$ is reached. As the number of hops increases, the numbers of constraints and variables increase, and thus the algorithm requires more iterations to converge. Overall, our proposed algorithm needs only a few iterations to converge at each time slot $t$, as shown in Fig. 19. For example, for three-hop transmission, the probability that the number of iterations is less than or equal to 7 is 90%.


In this chapter, the author proposed a multi-hop multi-path scheduling scheme to support reliable communication, incorporating the probabilistic latency constraint and traffic splitting techniques in 5G mmWave networks. In particular, the problem was modeled as a network utility maximization subject to bounded latency with a guaranteed reliability probability, and network stability. Massive MIMO and mmWave communication techniques were employed to further improve the DL transmission of multi-hop self-backhauled small cells. By leveraging stochastic optimization, the problem was decoupled into PS and RA, which were solved by applying reinforcement learning and successive convex approximation methods, respectively. A comprehensive performance analysis of our proposed algorithm was mathematically provided. Numerical results show that our proposed framework significantly reduces the latency compared to the baselines with and without learning.

[Figure 19: CDF plot. x-axis: Number of Iterations (0 to 13); y-axis: CDF (0 to 1); curves: 5 Hops - 21 BSs, 3 Hops - 9 BSs.]

Fig. 19. The iterative algorithm convergence ([24] © 2019 IEEE).

This chapter addressed the problem of selecting the best paths from many possible paths in multi-hop multi-path mmWave networks. The traffic aggregation was assumed to be done perfectly, which is not practical, and thus a possible research direction is to investigate the impact of imperfect traffic aggregation at the UEs. For simulation purposes, a simple assumption was made that the traffic is split equally among the flows. In practice, the weight of each traffic flow should be proportional to the route load or other design metrics.

Moreover, the state-action space is much larger in ultra-dense networks, and hence the proposed reinforcement learning solution would be limited by its relatively slow convergence speed. Hence, in order to obtain a faster solution, the concept of deep reinforcement learning should be leveraged.


In the next chapters 5 and 6, the author takes a closer look at the access links and studies simplified scenarios. Specifically, chapter 5 considers a single cell massive MIMO system, whereby a macro BS is equipped with a large antenna array to serve multiple outdoor users and the SCs are replaced by normal UEs. By doing so, the author focuses on a simplified scenario to find a solution providing low latency communication with eMBB services. Moreover, chapter 6 focuses on ultra-dense SC networks, in which the macro cell layer is removed from the scenario. Chapter 6 studies the problem of providing reliable communication, which aims to achieve high average rates with a small variance.


5 Low-latency communication in massive

MIMO wireless networks

This chapter examines a simplified scenario focusing on a single cell massive MIMO

wireless network in which wireless backhaul SCs are not considered, but the studied

problem can be straightforwardly extended to the multi-cell scenario where the SCs

act as normal UEs. To that end, this chapter answers the third question, Q3 of how to

provide low-latency communication for eMBB services, which is an essential issue in

5G wireless networks.


Most of the existing works on mmWave-enabled massive MIMO systems focus mainly

on providing capacity improvements, while latency and reliability are not addressed.

Although latency and reliability are applicable to many scenarios (e.g. mission-critical

applications), this chapter is concerned with addressing the fundamental question in

mmWave-enabled massive MIMO systems of how to simultaneously provide order of

magnitude capacity improvements and latency reduction. To this end, the Lyapunov

framework is extended to incorporate probabilistic latency constraints, taking into ac-

count the queue state, arrival rate, and channel dynamics with a guaranteed probability.

5.2 System model

Consider the downlink (DL) transmission of a single cell massive MIMO system$^{11}$ consisting of one macro base station (MBS) equipped with $N$ antennas, and a set $\mathcal{M} = \{1, \ldots, M\}$ of single-antenna user equipments (UEs). We assume that $N \ge M$ and $N \gg 1$. Further, co-channel time-division duplexing (TDD) is considered, in which the MBS estimates channels via the uplink phase. We denote the propagation channel between the MBS and the $m$th UE as $\mathbf{H}_m = \sqrt{N} \boldsymbol{\Theta}_m^{1/2} \tilde{\mathbf{H}}_m$, where $\boldsymbol{\Theta}_m \in \mathbb{C}^{N \times N}$ depicts the antenna spatial correlation, and the rank of the spatial correlation matrix $\boldsymbol{\Theta}_m$ is much smaller than the number of antennas due to the limited spatial scattering of the MIMO environment. Moreover, the spatial channel model is clustered, which belongs to a finite set with a finite size [25]. The elements of $\tilde{\mathbf{H}}_m \in \mathbb{C}^{N \times 1}$ are independent and identically distributed (i.i.d.) with zero mean and variance $1/N$. In addition, the channels experience flat and block fading, and imperfect channel state information (CSI) is assumed. As per [41], the estimated channel can be modelled as

$$\hat{\mathbf{H}}_m = \sqrt{1 - \tau_m^2}\, \mathbf{H}_m + \tau_m \sqrt{N} \boldsymbol{\Theta}_m^{1/2} \mathbf{z}_m, \quad \forall m \in \mathcal{M}.$$

Here, $\mathbf{z}_m \in \mathbb{C}^{N \times 1}$ denotes the estimation noise vector, which has i.i.d. elements with zero mean and variance $1/N$, and $\tau_m \in [0, 1]$ reflects the estimation error; in the case of perfect CSI, $\tau_m = 0$.

11 The studied model can be extended to multi-cell massive MIMO systems in which the problem of inter-cell interference can be addressed by designing a hierarchical precoder at the MBS to mitigate both intra-cell and inter-cell interference, or by applying an interference coordination approach [25].

Given the estimated channel matrix $\hat{\mathbf{H}} = [\hat{\mathbf{h}}_1, \cdots, \hat{\mathbf{h}}_M] \in \mathbb{C}^{N \times M}$, the MBS employs beamforming techniques to exploit the spatial multiplexing gains of a massive MIMO system [25, 140]. Within the scope of this chapter, we consider a digital beamforming scheme for a single-cell massive MIMO system, whereas a hybrid beamforming design, which can be applied to more complex systems, is left for future work [113, 114]. In particular, the MBS utilizes the regularized zero-forcing (RZF) precoder with a precoding matrix $\mathbf{V} = [\mathbf{v}_1, \cdots, \mathbf{v}_M] \in \mathbb{C}^{N \times M}$, which is given by $\mathbf{V} = \big(\hat{\mathbf{H}}^{\dagger}\hat{\mathbf{H}} + N\zeta\mathbf{I}_N\big)^{-1}\hat{\mathbf{H}}^{\dagger}$ [25, 41]. Note that the regularization parameter $\zeta > 0$ is scaled by $N$ to ensure that the matrix $\hat{\mathbf{H}}^{\dagger}\hat{\mathbf{H}} + N\zeta\mathbf{I}_N$ is well-conditioned as $N \to \infty$ [25]. Collecting all allocated powers in the diagonal matrix $\mathbf{P} = \mathrm{diag}(p_1, \cdots, p_M)$, we get the constraint $\mathrm{Tr}\big(\mathbf{P}\mathbf{V}^{\dagger}\mathbf{V}\big) \leq P$, with $P$ the maximum transmit power of the MBS. With the aid of the results in [41, Theorem 1], the transmit power constraint is derived as

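$$\frac{1}{N}\sum_{m=1}^{M}\frac{p_m}{\Omega_m} \leq P, \quad \text{and } p_m \geq 0,\ \forall m \in \mathcal{M}, \tag{90}$$

where the parameter $\Omega_m$ is the solution to $\Omega_m = \frac{1}{N}\mathrm{Tr}\Big(\boldsymbol{\Theta}_m\Big(\frac{1}{N}\sum_{k=1}^{M}\frac{\boldsymbol{\Theta}_k}{\zeta+\Omega_k} + \mathbf{I}_N\Big)^{-1}\Big)$. By designing the precoding matrix $\mathbf{V}$ and the transmit power vector $\mathbf{p} = (p_1, \cdots, p_M)$, the ergodic DL rate of UE $m \in \mathcal{M}$ is expressed as

$$r_m(\mathbf{p}) = \mathbb{E}\left[\log\left(1 + \frac{p_m|\mathbf{h}_m^{\dagger}\mathbf{v}_m|^2}{\sum_{k=1,k\neq m}^{M} p_k|\mathbf{h}_m^{\dagger}\mathbf{v}_k|^2 + \sigma_m^2}\right)\right]. \tag{91}$$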

Here, the thermal noise of user $m$ is $\eta_m \sim \mathcal{CN}(0,\sigma_m^2)$. The ergodic DL rate in (91) involves a stochastic expectation over the CSI realizations and does not have a closed-form expression [25]. We invoke results from random matrix theory to obtain a deterministic equivalent of the ergodic DL rate [25, 41]. In particular, as $N \geq M$ and $N \gg 1$, for a small fixed $\zeta$, the ergodic DL rate almost surely converges to

$$r_m(\mathbf{p}) \xrightarrow{\text{a.s.}} \log\big(1 + p_m(1-\tau_m^2)\big), \quad \forall m \in \mathcal{M}, \tag{92}$$

where $\xrightarrow{\text{a.s.}}$ denotes almost sure convergence [25], [41, Theorem 2]. Moreover, we assume that the MBS has queue buffers to store the UE data [77]. The queue length for UE $m$ at time slot $t$ is denoted by $Q_m(t)$, which evolves as follows:

$$Q_m(t+1) = \big[Q_m(t) - r_m(t)\big]^{+} + a_m(t), \quad \forall m \in \mathcal{M}, \tag{93}$$

where $[x]^{+} = \max\{x, 0\}$ and $a_m(t)$ is the data arrival rate of UE $m$. Further, we assume that $a_m(t)$ is i.i.d. over time slots with a mean arrival rate of $\bar{a}_m$ and upper bounded by $a_m^{\max}$ [77].
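The queue recursion (93) can be illustrated with a few lines of Python. This is only a minimal sketch: the function names `update_queue` and `simulate_queue` are hypothetical, and rates and arrivals are assumed to be expressed in the same normalized units per slot.

```python
def update_queue(q, served, arrival):
    """One step of (93): Q(t+1) = [Q(t) - r(t)]^+ + a(t)."""
    return max(q - served, 0.0) + arrival


def simulate_queue(rates, arrivals, q0=0.0):
    """Return the queue trajectory [Q(0), Q(1), ...] for given rates and arrivals."""
    q = q0
    history = [q]
    for r, a in zip(rates, arrivals):
        q = update_queue(q, r, a)
        history.append(q)
    return history
```

The backlog grows whenever the served rate stays below the arrivals and drains once the rate exceeds them, which is the behaviour the latency constraint introduced next is meant to control.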


According to Little's law [125], the average latency is proportional to $\lim_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathbb{E}[Q_m(t)]/\bar{a}_m$. We use $Q_m(t)/\bar{a}_m$ as a latency measure and enforce an allowable upper bound $d_m^{\mathrm{th}}$. Note that the latency bound violation is related to reliability. Thus, taking into account the latency and reliability requirements, we characterize the latency bound violation with a tolerable probability. Specifically, we impose a probabilistic constraint on the queue length for UE $m \in \mathcal{M}$ as follows:

$$\Pr\left(\frac{Q_m(t)}{\bar{a}_m} \geq d_m^{\mathrm{th}}\right) \leq \epsilon_m, \quad \forall t. \tag{94}$$

In (94), $d_m^{\mathrm{th}}$ reflects the upper bound of the UE latency requirement. Here, $\epsilon_m \ll 1$ is the target probability for reliable communication.

To avoid the over-allocation of network resources to the UEs, i.e., $r_m(t) \gg Q_m(t)$, we incorporate a maximum rate constraint $r_m^{\max}$ for each UE $m$, i.e., $r_m^{\max} := \min\{r_m^{\max}, Q_m(t)\}$. Moreover, we require the MBS to guarantee a certain level of QoS for all UEs, i.e., the minimum rate requirement $r_m^{\min}, \forall m \in \mathcal{M}$.

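We define the network utility as $\sum_{m=1}^{M}\omega_m f(\bar{r}_m)$, where $\bar{r}_m = \lim_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\mathbb{E}[r_m(t)]$ denotes the time average expected rate and $\omega_m$ represents the non-negative weight for each UE $m$. Additionally, we assume that $f(\cdot)$ is a strictly concave, increasing, and twice continuously differentiable function. Taking into account the constraints presented above yields the following network utility maximization problem:

$$\begin{aligned}
\textbf{OP3:}\quad \max_{\mathbf{P}(t)}\quad & \sum_{m=1}^{M}\omega_m f(\bar{r}_m) && \text{(95a)}\\
\text{subject to}\quad & r_m^{\min} \leq r_m(t) \leq r_m^{\max}, \quad \forall m \in \mathcal{M},\ \forall t, && \text{(95b)}\\
& \text{(90) and (94)}.
\end{aligned}$$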

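Our main problem involves the probabilistic constraint (94), which cannot be addressed tractably. To overcome this challenge, we apply Markov's inequality [127] to linearize (94) such that $\Pr\big(\frac{Q_m(t)}{\bar{a}_m} \geq d_m^{\mathrm{th}}\big) \leq \frac{\mathbb{E}[Q_m(t)]}{\bar{a}_m d_m^{\mathrm{th}}}$. Then, (94) is satisfied if

$$\mathbb{E}[Q_m(t)] \leq \bar{a}_m d_m^{\mathrm{th}}\epsilon_m, \quad \forall m \in \mathcal{M},\ \forall t. \tag{96}$$

Thereafter, we consider (96) to represent the latency and reliability constraint. Assuming that $\{a_m(t) \mid \forall t \geq 1\}$ is a Poisson arrival process [127], we note that $\mathbb{E}[Q_m(t)] = t\bar{a}_m - \sum_{\tau=1}^{t} r_m(\tau)$, which is plugged into (96). Finally, we obtain

$$r_m(t) \geq t\bar{a}_m - \bar{a}_m d_m^{\mathrm{th}}\epsilon_m - \sum_{\tau=1}^{t-1} r_m(\tau), \quad \forall m \in \mathcal{M},\ \forall t, \tag{97}$$

which represents the minimum rate requirement in slot $t$ for UE $m$ for low latency communication. Here, we transform the probabilistic latency and reliability constraint (94) into one linear constraint (97) on the instantaneous rate, which helps to analyse and optimize the URLLC problem. Combining (95b) and (97), we rewrite OP3 as follows:

$$\begin{aligned}
\max_{\mathbf{P}(t)}\quad & \sum_{m=1}^{M}\omega_m f(\bar{r}_m) && \text{(98a)}\\
\text{subject to}\quad & r_m^{0}(t) \leq r_m(t) \leq r_m^{\max}, \quad \forall m \in \mathcal{M},\ \forall t, && \text{(98b)}\\
& \text{and (90)},
\end{aligned}$$

with $r_m^{0}(t) = \max\big\{r_m^{\min},\ t\bar{a}_m - \bar{a}_m d_m^{\mathrm{th}}\epsilon_m - \sum_{\tau=1}^{t-1} r_m(\tau)\big\}$.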
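The lower bound in (98b) is a simple per-slot computation. The following Python sketch shows one way to evaluate it; the function name `min_rate_requirement` and its argument names are illustrative assumptions, and all quantities are assumed to be normalized to per-slot units.

```python
def min_rate_requirement(t, mean_arrival, d_th, eps, past_rates, r_min):
    """Lower bound r_m^0(t) of (98b).

    past_rates holds r_m(1), ..., r_m(t-1); the latency-driven term is the
    right-hand side of (97), and the QoS floor is r_min.
    """
    latency_term = t * mean_arrival - mean_arrival * d_th * eps - sum(past_rates)
    return max(r_min, latency_term)
```

When the UE has already been served generously in earlier slots, the latency-driven term becomes small or negative and the QoS floor `r_min` takes over.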

5.4 Proposed control parameter selection and power allocation

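To tackle (98), we resort to the Lyapunov framework [77]. Firstly, for each DL rate $r_m(t)$, we introduce the auxiliary variable vector $\boldsymbol{\varphi}(t) = (\varphi_m(t) \mid \forall m \in \mathcal{M})$ that satisfies

$$\bar{\varphi}_m = \lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T}\mathbb{E}\big[\varphi_m(t)\big] \leq \bar{r}_m, \quad \forall m \in \mathcal{M}, \tag{99}$$

$$\varphi_m^{0}(t) \leq \varphi_m(t) \leq r_m^{\max}, \quad \forall m \in \mathcal{M},\ \forall t, \tag{100}$$

with $\varphi_m^{0}(t) = \max\big\{r_m^{\min},\ t\bar{a}_m - \bar{a}_m d_m^{\mathrm{th}}\epsilon_m - \sum_{\tau=1}^{t-1}\varphi_m(\tau)\big\}$. Incorporating the auxiliary variables, (98) is equivalent to

$$\begin{aligned}
\textbf{RP3:}\quad \max_{\mathbf{P}(t),\boldsymbol{\varphi}(t)}\quad & \lim_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\sum_{m=1}^{M}\omega_m\mathbb{E}\big[f(\varphi_m(t))\big]\\
\text{subject to}\quad & \text{(90), (99), and (100)}.
\end{aligned}$$

In order to ensure the inequality constraint (99), a virtual queue vector $\mathbf{Y}(t) = (Y_m(t) \mid \forall m \in \mathcal{M})$ is introduced, where each element evolves according to

$$Y_m(t+1) = \big[Y_m(t) + \varphi_m(t) - r_m(t)\big]^{+}, \quad \forall m \in \mathcal{M}. \tag{101}$$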

Subsequently, we express the conditional Lyapunov drift-plus-penalty for each time slot $t$ as:

$$\mathbb{E}\left[\sum_{m=1}^{M}\Big[\tfrac{1}{2}Y_m(t+1)^2 - \tfrac{1}{2}Y_m(t)^2 - \nu_m(t)\omega_m f(\varphi_m(t))\Big]\,\Big|\,\mathbf{Y}(t)\right]. \tag{102}$$

In (102), $\nu_m(t)$ is the control parameter which affects the utility-queue length trade-off. This control parameter is conventionally chosen to be static and identical for all UEs [77]. However, this setting does not suit system dynamics (e.g., instantaneous data arrivals) or diverse system configurations (i.e., different latency and QoS requirements). Thus, we dynamically design these control parameters. From the analysis in the Lyapunov optimization framework [77], we can find $Y_m(t) \leq \nu_m(t)\omega_m\pi_m + a_m^{\max}$, with $\pi_m$ being the largest first-order derivative of $f(x)$. Letting $\omega_m = 1, \forall m \in \mathcal{M}$, we have the lower bound $\pi_m\nu_m(t) \geq \nu_m^{0}(t), \forall m \in \mathcal{M}$, for selecting the control parameters, where $\nu_m^{0}(t) = \max\{Y_m(t) - a_m^{\max}, 1\}$. Subsequently, following the straightforward calculations of the Lyapunov drift-plus-penalty technique, we obtain

$$\begin{aligned}
(102) \leq \mathbb{E}\Big[&\sum_{m=1}^{M}\Big(Y_m(t)\varphi_m(t) - \nu_m(t)\omega_m f\big(\varphi_m(t)\big)\Big) && \text{(103a)}\\
&- \sum_{m=1}^{M} Y_m(t)\,r_m\big(\mathbf{P}(t)\big) + C\,\Big|\,\mathbf{Y}(t)\Big]. && \text{(103b)}
\end{aligned}$$

Due to space limitation, we omit the details of the constant value C which does not

influence the system performance [77]. We note that the solution to LP is acquired

by minimizing the right-hand side (RHS) of (103a) and (103b) in every slot t. Further,

(103a) is related to the reliability and QoS requirements while (103b) reflects optimal

power allocation to UEs.
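The two per-slot bookkeeping steps used above, the virtual queue update (101) and the lower bound $\nu_m^0(t)$ on the control parameter, can be sketched as follows. The function names are hypothetical, and the sketch assumes $\omega_m = 1$ as in the text.

```python
def update_virtual_queue(y, phi, rate):
    """Virtual queue recursion (101): Y(t+1) = [Y(t) + phi(t) - r(t)]^+."""
    return max(y + phi - rate, 0.0)


def control_lower_bound(y, a_max):
    """Lower bound nu_m^0(t) = max{Y_m(t) - a_m^max, 1} on pi_m * nu_m(t)."""
    return max(y - a_max, 1.0)
```

A growing virtual queue signals that the auxiliary variable has been running ahead of the served rate, and the bound then pushes the control parameter upward for that UE.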


Algorithm 5.1 CCP algorithm for solving sub-problem (104) ([72] ©2017 IEEE).

1: $m \in \mathcal{M}$
2: Initialize $i = 0$ and a feasible point $\nu_m^{(i)}$ in (104b).
3: repeat
4:    Convexify $g_0(\nu_m, \nu_m^{(i)}) = g_0(\nu_m^{(i)}) + \nabla g_0(\nu_m^{(i)})\,(\nu_m - \nu_m^{(i)})$.
5:    Solve:
6:       $\min_{\varphi_m,\nu_m}\ h_0(\varphi_m,\nu_m) - g_0(\nu_m,\nu_m^{(i)}) + Y_m\varphi_m$
7:       subject to (104b) and (104c).
8:    Find the optimal $\varphi_m^{(i)\star}$ and $\nu_m^{(i)\star}$.
9:    Update $\nu_m^{(i+1)} := \nu_m^{(i)\star}$ and $i := i + 1$.
10: until convergence


5.4.1 Control parameters selection

Considering the logarithmic fairness utility function, i.e., $f(x) = \log(x)$, minimizing the RHS of (103a) for each $m \in \mathcal{M}$ is formulated as

$$\begin{aligned}
\min_{\varphi_m(t),\nu_m(t)}\quad & Y_m(t)\varphi_m(t) - \nu_m(t)\log\big(\varphi_m(t)\big) && \text{(104a)}\\
\text{subject to}\quad & \pi_m\nu_m(t) \geq \nu_m^{0}(t), && \text{(104b)}\\
& r_m^{0}(t) \leq \varphi_m(t) \leq r_m^{\max}. && \text{(104c)}
\end{aligned}$$

Before proceeding with (104), we rewrite $-\nu_m(t)\log(\varphi_m(t))$ in (104a), for any $\varphi_m(t) > 0$ and $\nu_m(t) > 0$, as

$$\underbrace{\nu_m(t)\log\left(\frac{\nu_m(t)}{\varphi_m(t)}\right)}_{h_0(\varphi_m,\nu_m)} - \underbrace{\nu_m(t)\log\big(\nu_m(t)\big)}_{g_0(\nu_m)},$$

in which both $h_0(\varphi_m,\nu_m)$ (i.e., the relative entropy function) and $g_0(\nu_m)$ (i.e., the negative entropy function) are convex functions. Since (104a) is a difference of convex functions while constraints (104b) and (104c) are affine, problem (104) falls under DC programming [141], which can be efficiently and iteratively addressed by the convex-concave procedure (CCP) [142]. The CCP algorithm to obtain the solution to problem (104) is detailed in Algorithm 5.1, which provably converges to a local optimum of the DC program [142] (please refer to [142] for the formal proof).
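As a worked step, the linearization used in the convexification step of Algorithm 5.1 can be written out explicitly. The derivation below is a sketch that assumes natural logarithms and drops the time index; it only expands the first-order Taylor expansion of $g_0$ around the current iterate $\nu_m^{(i)}$.

```latex
\begin{align*}
g_0(\nu_m) &= \nu_m \log \nu_m, \qquad \nabla g_0(\nu_m) = \log \nu_m + 1, \\
g_0(\nu_m, \nu_m^{(i)}) &= \nu_m^{(i)} \log \nu_m^{(i)}
   + \big(\log \nu_m^{(i)} + 1\big)\big(\nu_m - \nu_m^{(i)}\big) \\
 &= \big(\log \nu_m^{(i)} + 1\big)\,\nu_m - \nu_m^{(i)}.
\end{align*}
```

Because $g_0$ is convex, this affine function never exceeds $g_0(\nu_m)$, so the surrogate objective $h_0(\varphi_m,\nu_m) - g_0(\nu_m,\nu_m^{(i)}) + Y_m\varphi_m$ is convex and upper-bounds (104a), with equality at $\nu_m = \nu_m^{(i)}$.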


5.4.2 Power allocation

The optimal transmit power in (103b) is computed by

$$\min_{\mathbf{P}(t)}\ -\sum_{m=1}^{M} Y_m(t)\,r_m\big(\mathbf{P}(t)\big) \quad \text{subject to (90)}.$$

Here, the objective function is strictly convex for $p_m(t) \geq 0, \forall m \in \mathcal{M}$, and the constraint set is compact. Therefore, the optimal solution $\mathbf{P}^{\star}(t)$ exists.
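To illustrate how such a solution can be obtained, the sketch below solves the power allocation problem under the additional assumption that the rate is replaced by its deterministic equivalent (92), i.e., $r_m = \log(1 + (1-\tau_m^2)p_m)$ with natural logarithms. Under that assumption the KKT conditions give a water-filling form, $p_m = [N\Omega_m Y_m/\lambda - 1/(1-\tau_m^2)]^+$, and the multiplier $\lambda$ is found by bisection. The function name `allocate_power` and its arguments are hypothetical, and $\tau_m < 1$ is required.

```python
def allocate_power(Y, omega, tau, N, P_max, iters=200):
    """Water-filling sketch: maximize sum_m Y_m*log(1 + c_m*p_m)
    subject to (1/N)*sum_m p_m/omega_m <= P_max and p_m >= 0,
    where c_m = 1 - tau_m**2 (assumes tau_m < 1)."""
    c = [1.0 - t * t for t in tau]

    def powers(lam):
        # KKT stationarity: Y_m*c_m/(1 + c_m*p_m) = lam/(N*omega_m)
        return [max(0.0, N * o * y / lam - 1.0 / ci)
                for y, o, ci in zip(Y, omega, c)]

    def used(lam):
        # Left-hand side of the power constraint (90)
        return sum(p / o for p, o in zip(powers(lam), omega)) / N

    # At lam = hi every power is zero, so the constraint holds there
    hi = max(N * o * y * ci for y, o, ci in zip(Y, omega, c))
    if hi <= 0.0:
        return [0.0] * len(Y)
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if used(mid) > P_max:
            lo = mid   # budget exceeded: increase the multiplier
        else:
            hi = mid   # feasible: try a smaller multiplier
    return powers(hi)
```

UEs with an empty virtual queue ($Y_m = 0$) receive no power in this sketch, while UEs with larger $Y_m$ or larger $\Omega_m$ are served first.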

After obtaining the optimal auxiliary variable and transmit power, we update the

queues Qm(t + 1) and Ym(t + 1) as per (93) and (101), respectively.


We consider a single-cell massive MIMO system in which the MBS, with N = 32 an-

tennas and P = 38 dBm, is located at the center of a 0.5× 0.5 km2 square area. UEs

(from 8 to 60 UEs per km2) are randomly deployed within the MBS’s coverage with a

minimum MBS-UE distance of 35 m. Data arrivals follow a Poisson distribution with different means, and the rate requirements are specified as $r_m^{\max} = 1.2\bar{a}_m$ and $r_m^{\min} = 0.8\bar{a}_m$, $\forall m \in \mathcal{M}$. The system bandwidth is 1 GHz. The path loss is modeled as a distance-based path loss with high probability from the line-of-sight (LOS) model for urban environments at 28 GHz [49]. $d^{\mathrm{th}}$ and $\epsilon$ are set to 10 ms and 5%, respectively. The numerical

results are obtained via Monte-Carlo simulations over 10000 channel realizations. Fur-

thermore, we compare our proposed scheme with the following baselines:

– Baseline 1 refers to the Lyapunov framework in which the probabilistic latency con-

straint (94) is considered.

– Baseline 2 is a variant of Baseline 1 without the probabilistic latency constraint (94).

5.5.1 Impact of the arrival rate

In Fig. 20, we report the average latency versus the mean arrival rates a = E[a(t)]

for M = 16. At a low a, no schemes violate latency constraints, and our proposed

algorithm outperforms other baselines with a small gap. At a higher a, the average

latency of baseline 2 increases dramatically as a > 1.8 Gbps, since baseline 2 does not

incorporate the latency constraint, whereas our proposed scheme reduces latency by

28.41% and 77.11% compared to baselines 1 and 2, respectively, when a = 2.4 Gbps.

Fig. 20. Average latency versus mean arrival rates, M = 16 per km$^2$ ([72] ©2017 IEEE). [Plot: average latency [ms] versus mean arrival rate [Gbps] for the proposed algorithm, Baseline 1, and Baseline 2; $d^{\mathrm{th}}$ = 10 ms, $\epsilon$ = 5%.]

When a > 2.4 Gbps, the average latency of all schemes increases, violating the latency requirement of 10 ms. It can be observed that under a limited maximum transmit power and a very high traffic demand, the latency requirement cannot be guaranteed. This highlights the tradeoff between the mean arrival rate and latency. In Fig. 21, we report the tail distribution (complementary cumulative distribution function (CCDF)) of the latency to showcase how often the system experiences a latency greater than the target latency levels. In particular, at a = 2.4 Gbps, by imposing the probabilistic latency constraint (94), our proposed approach and baseline 1 ensure reliable communication with better guaranteed probabilities, i.e., Pr(latency > 7.5 ms) < $10^{-4}$ and Pr(latency > 9.4 ms) < $10^{-4}$, respectively. In contrast, baseline 2 violates the latency constraint with a high probability, where Pr(latency > 10 ms) = 74.75%.

Fig. 21. Tail distribution (CCDF) of latency ([72] ©2017 IEEE). [Plot: CCDF versus latency [ms] for the proposed algorithm, Baseline 1 (BL1), and Baseline 2 (BL2) at λ = 2, 2.4, and 2.6 Gbps. Annotations: BL2: Pr(delay > 10) = 74.75%; Proposed: Pr(delay > 7.4) < 1e-4; BL1: Pr(delay > 9.4) < 1e-4.]

5.5.2 Impact of user density

In Fig. 22, we compare the average user throughput (avgUT) and average latency of our proposed approach with the two baselines under the impact of user density, when a = 2 Gbps. Additionally, we consider the weighted sum rate maximization (WSRM) case. The WSRM case is used to find the system throughput limit but suffers from higher latency. Since all users share the same resources, the average latency ("solid lines") increases with the number of users M, whereas the avgUT ("dashed lines") decreases. Fig. 22 further shows that when M > 24, the latency of all schemes increases dramatically and is far above the latency requirement. Hence, only a limited number of users can be served while guaranteeing the latency requirement; beyond this number, a tradeoff between latency and network density exists. Our proposed approach achieves better throughput and a higher latency reduction than baselines 1 and 2, while the WSRM case has the worst latency performance, as expected. Moreover, our proposed approach reaches Gbps capacity, which represents the capacity improvement brought by the combination of mmWave and massive MIMO techniques. Compared with WSRM, our proposed approach maintains at least 87% of the throughput limit, while achieving up to 80% latency reduction. Numerical results show that our approach simultaneously provides order-of-magnitude capacity improvements and latency reduction.

Fig. 22. Average latency and avgUT versus number of users per km$^2$ ([72] ©2017 IEEE). [Plot: average user throughput [Gbps] (throughput curves) and average latency [ms] (latency curves) versus number of nodes per km$^2$, from 8 to 60, for WSRM, the proposed algorithm, Baseline 1, and Baseline 2.]


This chapter investigated the problem of mmWave-enabled massive MIMO networks

from a latency standpoint. Specifically, the problem was modeled as a NUM problem

subject to the probabilistic latency constraint and QoS/rate requirement. Numerical

results show that the proposed approach reduces the latency by 28.41% and 77.11%

compared to current baselines.

The proposed solution can be straightforwardly extended to a multi-cell scenario in

which the problem of inter-cell interference can be addressed by designing a hierarchi-

cal precoder to mitigate both intra-cell and inter-cell interference, or by applying an

interference coordination approach. In addition, to achieve lower-latency communication, multi-connectivity and antenna diversity should be investigated in future work.

This chapter addressed the problem of low latency communication. In the next chapter,

the last question Q4 will be answered to provide more reliable communication in ultra-

dense SC networks in the presence of risk and uncertainty.


6 Ultra-reliable communication in 5G mmWave

networks

This chapter addresses another key concern in 5G wireless networks, which is reliability.

Specifically, research question Q4 is answered, which enables ultra-reliable communi-

cation in ultra-dense SC networks in the presence of risk and uncertainty. Note that for

the sake of simplification, this chapter does not consider the macro cell, but the pro-

posed approach can be applied directly to other studied scenarios in previous chapters.


A unique peculiarity of mmWave bands is that mmWave links are very sensitive to

blockage, which gives rise to unstable connectivity and unreliable communication. To

overcome this challenge, the author leverages principles of risk-sensitive reinforcement

learning (RSL) and exploits multiple antenna diversity and higher bandwidth to opti-

mize the transmission to achieve gigabit data rates, while considering the sensitivity of

mmWave links to provide ultra-reliable communication (URC). The prime motivation

behind using RSL stems from the fact that the risk-sensitive utility function to be optimized is a function of not only the average but also the variance [143], and it thus captures the tail of the rate distribution to enable URC.

Related work

In [5] the authors provided principles of wireless communication to support URLLC

such as the use of antenna diversity, network base station densification, and flexible

frame/network designs. [6] briefly defined the latency and reliability concepts, and fur-

ther described some techniques to support URLLC with respect to the risk, tail and scale.

In particular, the risk involves decision making under uncertainty in the presence of

highly fluctuating channel and network dynamics; the tail is related to the tail behaviour

of random traffic arrival or rate distributions under worse channel state; and scale is con-

nected to the case when large numbers of devices are deployed, which requires URLLC

that poses resource allocation and network design challenges [6]. Recently, the problem

of low latency communication [144] and URLLC [72] for 5G mmWave networks was


studied to evaluate the performance under the impact of traffic dispersion and network

densification. All these works focus on maximizing the time average of the network

throughput or minimizing the mean latency without providing any guarantees for higher

order moments (e.g., variance, skewness, kurtosis, etc.). This chapter departs from the

classical average-based system design and instead takes account of higher order mo-

ments in the utility function to formulate an RSL framework in which every small cell

optimizes its transmission while taking into account signal fluctuations.

6.2 System model

Let us consider the mmWave downlink (DL) transmission of a small cell network con-

sisting of a set B of B small cells (SCs), and a set K of K user equipments (UEs)

equipped with Nk antennas. We assume that each SC is equipped with a large number

of $N_b$ antennas to exploit massive MIMO gain and adopts a hybrid beamforming architecture [120], and we assume that $N_b \gg N_k \geq 1$. Without loss of generality, one UE per SC is considered$^{12}$. The data traffic is generated from the SC to UE via mmWave

communication. A co-channel time-division duplexing protocol is considered, in which

the DL channel can be obtained via the uplink training phase.

Each SC adopts the hybrid beamforming architecture, which enjoys both analog and digital beamforming techniques [120]. Let $g_{bk}^{(\mathrm{tx})}$ and $g_{bk}^{(\mathrm{rx})}$ denote the analog transmitter and receiver beamforming gains at SC $b$ and UE $k$, respectively. In addition, we use $\omega_{bk}^{(\mathrm{tx})}$ and $\omega_{bk}^{(\mathrm{rx})}$ to represent the angles deviating from the strongest path between SC $b$ and UE $k$. Also, let $\theta_{bk}^{(\mathrm{tx})}$ and $\theta_{bk}^{(\mathrm{rx})}$ denote the beamwidths at the SC and UE, respectively. We denote $\boldsymbol{\theta}$ as the vector of the transmitter beamwidths of all SCs. We adopt the widely used antenna radiation pattern model [120] to determine the analog beamforming gain as

$$g_{bk}(\omega_{bk},\theta_{bk}) =
\begin{cases}
\dfrac{2\pi - (2\pi - \theta_{bk})\Gamma}{\theta_{bk}}, & \text{if } |\omega_{bk}| \leq \dfrac{\theta_{bk}}{2},\\[2mm]
\Gamma, & \text{otherwise},
\end{cases} \tag{106}$$

where $0 < \Gamma \ll 1$ is the side lobe gain.
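The sectored pattern (106) translates directly into code. The following Python sketch is illustrative only; the function name `beamforming_gain` and its argument names are assumptions, and angles are in radians.

```python
import math


def beamforming_gain(omega, theta, side_lobe):
    """Analog beamforming gain (106).

    omega: angle deviating from the strongest path [rad]
    theta: beamwidth [rad]
    side_lobe: side lobe gain Gamma, with 0 < Gamma << 1
    """
    if abs(omega) <= theta / 2.0:
        # Main lobe: chosen so the gain integrates to 2*pi over all angles
        return (2.0 * math.pi - (2.0 * math.pi - theta) * side_lobe) / theta
    return side_lobe
```

A narrower beam yields a larger main-lobe gain, which is the lever the beamwidth optimization in this chapter relies on; the cost is a higher sensitivity to misalignment, since a deviation beyond half the beamwidth only sees the side lobe gain.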

Let $\mathbf{H}_{bk} \in \mathbb{C}^{N_b \times N_k}$ denote the channel propagation matrix (channel state) from SC $b$ to UE $k$. We assume a time-varying channel state described by a Markov chain with $T \in \mathbb{Z}^{+}$ states, i.e., $\mathbf{H}_{bk}(t)$, $t = 1, \ldots, T$. Considering imperfect channel state information (CSI), the estimated channel state between SC $b$ and UE $k$ is modelled as [72]

$$\hat{\mathbf{H}}_{bk} = \sqrt{N_b \times N_k}\,\boldsymbol{\Theta}_{bk}^{1/2}\Big(\sqrt{1-\tau_k^2}\,\mathbf{W}_{bk} + \tau_k\hat{\mathbf{W}}_{bk}\Big),$$

where $\boldsymbol{\Theta}_{bk} \in \mathbb{C}^{N_b \times N_b}$ is the spatial channel correlation matrix with a low rank that accounts for the mmWave channel path loss and shadow fading [25, 136]. Moreover, the spatial channel model is clustered, which belongs to a finite set with a finite size [25]. Here, $\mathbf{W}_{bk} \in \mathbb{C}^{N_b \times N_k}$ is the small-scale fading channel matrix, modelled as a random matrix with zero mean and variance $\frac{1}{N_b \times N_k}$. The parameter $\tau_k \in [0,1]$ reflects the estimation accuracy for UE $k$; $\tau_k = 0$ corresponds to perfect channel state information. $\hat{\mathbf{W}}_{bk} \in \mathbb{C}^{N_b \times N_k}$ is the estimation noise matrix, also modelled as a random matrix with zero mean and variance $\frac{1}{N_b \times N_k}$. We denote $\mathcal{H} = \{\mathbf{H}_{bk} \mid \forall b \in \mathcal{B}, \forall k \in \mathcal{K}\}$ as the network state.

$^{12}$ For the multiple-UE case, additional channel estimation and user scheduling need to be considered. One example was studied in [23].

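By applying a linear precoding scheme $\mathbf{V}_{bk}(\mathbf{H}_{bk})$ [120], i.e., $\mathbf{V}_{bk}(\mathbf{H}_{bk}) = \mathbf{H}_{bk}$ for conjugate precoding, the achievable rate$^{13}$ of UE $k$ from SC $b$ can be calculated as

$$r_b(t) = w\log\left(1 + \frac{p_b\,g_{bk}^{(\mathrm{tx})}g_{bk}^{(\mathrm{rx})}|\mathbf{H}_{bk}^{\dagger}\mathbf{V}_{bk}|^2}{\sum_{b' \neq b} p_{b'}\,g_{b'k}^{(\mathrm{tx})}g_{b'k}^{(\mathrm{rx})}|\mathbf{H}_{b'k}^{\dagger}\mathbf{V}_{b'k}|^2 + \sigma_{bk}^2}\right),$$

where $p_b$ and $p_{b'}$ are the transmit powers of SC $b$ and SC $b'$, respectively. In addition, $w$ denotes the system bandwidth of the mmWave frequency band. The thermal noise of user $k$ served by SC $b$ is $\eta_{bk} \sim \mathcal{CN}(0,\sigma_{bk}^2)$. Here, we denote $P_b^{\max}$ as the maximum transmit power of SC $b$ and $\mathbf{p} = (p_b \mid \forall b \in \mathcal{B},\ 0 \leq p_b \leq P_b^{\max})$ as the transmit power vector.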


We model a decentralized optimization problem and harness tools from RSL to solve it, whereby the SCs autonomously respond to the network states based on the historical data. Let us consider a joint optimization of the transmitter beamwidth$^{14}$ $\boldsymbol{\theta}$ and the transmit power allocation $\mathbf{p}$. We denote $\mathbf{z}(t) = (\boldsymbol{\theta}(t), \mathbf{p}(t))$, which takes values in $\mathcal{Z} = \{z_1, \cdots, z_B\}$, where $z_b = (\theta_b, p_b)$. Assume that each SC $b$ selects its beamwidth and transmit power drawn from a given probability distribution $\boldsymbol{\pi}_b = (\pi_b^1, \cdots, \pi_b^m, \cdots, \pi_b^{Z_b})$, in which $Z_b$ is the cardinality of the set of all combinations $(\theta_b, p_b)$, i.e., $\sum_{m=1}^{Z_b}\pi_b^m = 1$. For each $m = 1, \cdots, Z_b$ and $z_b^m = (\theta_b^m, p_b^m)$, the mixed-strategy probability is defined as

$$\pi_b^m(t) = \Pr\big(z_b(t) = z_b^m \mid z_b(0:t-1), \boldsymbol{\pi}_b(0:t-1)\big). \tag{107}$$

$^{13}$ Note that we omit the beam search/track time, since it can be done in a short time compared to the transmission time [119]. We assume that each BS sends a single stream to its users via the main beams.

$^{14}$ As studied in [120], for $\eta \leq \frac{1}{3}$, the problem of selecting the beamwidth for the transmitter and receiver can be done by adjusting the transmitter beamwidth with a fixed receiver beamwidth.

We denote $\boldsymbol{\pi} = \{\boldsymbol{\pi}_1, \cdots, \boldsymbol{\pi}_b, \cdots, \boldsymbol{\pi}_B\} \in \Pi$, in which $\Pi$ is the set of all possible probability mass functions (PMF). Let $\mathbf{r} = (\mathbf{r}_1, \cdots, \mathbf{r}_B)$ denote the instantaneous rates, in which $\mathbf{r}_b = (r_b(0), \cdots, r_b(T))$. Let $\mathcal{R}$ denote the rate region, which is defined as the convex hull of the rates [131], i.e., $\mathbf{r} \in \mathcal{R}$. Inspired by the RSL [143], we consider the following utility function, given by

$$u_b = \frac{1}{\mu_b}\log\mathbb{E}_{\mathcal{H},\boldsymbol{\pi}}\left[\exp\Big(\mu_b\sum_{t=0}^{T} r_b(t)\Big)\right], \tag{108}$$

where the parameter $\mu_b < 0$ denotes the desired risk-sensitivity, which will penalize the variability [143], and the operator $\mathbb{E}$ denotes the expectation operation.
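The effect of the risk-sensitivity parameter in (108) can be checked numerically. The Python sketch below estimates the utility from a list of sampled total rates, replacing the expectation with a sample average (an assumption made for illustration); the function name `risk_sensitive_utility` is hypothetical.

```python
import math


def risk_sensitive_utility(samples, mu):
    """Sample estimate of (1/mu) * log E[exp(mu * X)] for mu != 0.

    samples: realizations of the total rate X = sum_t r_b(t)
    mu: risk-sensitivity parameter (mu < 0 is risk-averse)
    """
    scaled = [mu * x for x in samples]
    peak = max(scaled)
    # log-mean-exp, shifted by the maximum for numerical stability
    log_mean = peak + math.log(sum(math.exp(s - peak) for s in scaled) / len(scaled))
    return log_mean / mu
```

For a constant rate the utility equals that rate, whereas for a fluctuating rate and $\mu_b < 0$ it falls below the sample mean, by roughly $|\mu_b|/2$ times the variance when $|\mu_b|$ is small.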

Remark 6.1. The Taylor expansion of the utility function given in (108) yields

$$u_b = \mathbb{E}_{\mathcal{H},\boldsymbol{\pi}}\left[\sum_{t=0}^{T} r_b(t)\right] + \frac{\mu_b}{2}\mathrm{Var}_{\mathcal{H},\boldsymbol{\pi}}\left[\sum_{t=0}^{T} r_b(t)\right] + \mathcal{O}\big(\mu_b^2\big).$$

Remark 6.1 shows that the utility function (108) considers both the mean and the variance (Var) of the mmWave link rates. We formulate the following distributed optimization problem for every SC as

$$\begin{aligned}
\textbf{OP4:}\quad \max_{\boldsymbol{\pi}_b}\quad & \frac{1}{\mu_b}\log\mathbb{E}_{\mathcal{H},\boldsymbol{\pi}_b}\left[\exp\Big(\mu_b\sum_{t=0}^{T} r_b(t)\Big)\right] && \text{(109a)}\\
\text{subject to}\quad & \mathbf{r}_b \in \mathcal{R},\quad \boldsymbol{\pi}_b \in \Pi,\quad p_b \leq P_b^{\max}. && \text{(109b)}
\end{aligned}$$

It is challenging to solve (109) if each SC is not able to fully observe the network state. This work does not assume explicit knowledge of the state transition probabilities. Here, we leverage the principles of RL to optimize the transmit beam in a fully decentralized manner [143, 80, 94].

6.4 Proposed distributed learning algorithm

In Fig. 23, each SC acts as an agent which selects an action to maximize a long-term reward based on user feedback and a probability distribution for each action. The action is defined as the selection of $z_b$, while the long-term utility in (109) is the reward, and the environment here contains the network state. To this end, we build the probability distribution for every action and provide an RL procedure to solve (109). We denote $u_b^m = u_b^m(z_b^m, z_{-b})$ as the utility function of SC $b$ when selecting $z_b^m$. Here, $z_{-b}$ denotes the composite variable of the other agents' actions excluding SC $b$. From (108), the utility $u_b(t)$ of SC $b$ at time slot $t$, i.e., $u_b = \sum_{t=0}^{T} u_b(t)$, is rewritten as

$$u_b(t) = \frac{1}{\mu_b}\log\left(\sum_{m=1}^{Z_b}\pi_b^m\exp\Big(\mu_b\,r_b^m\big(z_b^m(t), z_{-b}\big)\Big)\right), \tag{110}$$

where $r_b^m(z_b^m(t), z_{-b})$ is the instantaneous rate of SC $b$ when choosing $z_b^m(t) = (\theta_b^m(t), p_b^m(t))$ with a probability of $\pi_b^m(t)$.

Fig. 23. A reinforcement learning model ([73] ©2018 IEEE). [Diagram: agent-environment loop with action(t), observation, reward(t), feedback, and new state(t+1); each episode spans time indices 0 to T and consists of an uplink training phase, a downlink transmission phase, and an uplink transmission and feedback phase; episodes alternate between LOS and NLOS states in the simulation.]

Remark 6.2. For a small $\mu_b$, (108) is approximated via the Taylor approximation$^{15}$ of $r_b$ around $\mu_b \to 0$ as

$$\begin{aligned}
u_b &= \frac{1}{\mu_b}\mathbb{E}\left[\sum_{t=0}^{T}\Big(\exp\big(\mu_b r_b(t)\big) - 1\Big)\right], && \text{(111)}\\
&= \frac{1}{T+1}\sum_{t=0}^{T}\frac{\exp\big(\mu_b r_b(t)\big) - 1}{\mu_b}, && \text{(112)}
\end{aligned}$$

where (112) is obtained by expanding the time average of (111). Each SC determines $(\theta_b^m, p_b^m)$ from $\mathcal{Z}_b$ based on the probability distribution from the previous stage $t-1$, i.e.,

$$\boldsymbol{\pi}_b(t-1) = \big(\pi_b^1(t-1), \cdots, \pi_b^{Z_b}(t-1)\big). \tag{113}$$

$^{15}$ For $x$ close to 1, the Taylor approximation of $\log(x)$ is $x - 1$.

We introduce the Boltzmann-Gibbs (BG) distribution, $\boldsymbol{\beta}_b(\mathbf{u}_b(t))$, to capture the trade-off between exploitation and exploration, given by

$$\boldsymbol{\beta}_b(\mathbf{u}_b(t)) = \arg\max_{\boldsymbol{\pi}_b \in \Pi}\sum_{m \in \mathcal{Z}_b}\Big[\pi_b^m u_b^m(t) - \kappa_b\pi_b^m\ln(\pi_b^m)\Big], \tag{114}$$

where $\mathbf{u}_b(t) = \big(u_b^1(t), \cdots, u_b^{Z_b}(t)\big)$ is the utility vector of SC $b$ for $z_b \in \mathcal{Z}_b$, and the trade-off factor $\kappa_b$ is used to maintain the balance between exploration and exploitation. If $\kappa_b$ is small, the SC selects the $z_b$ with the highest payoff. For $\kappa_b \to \infty$, all decisions have an equal chance.

For a given $\mathbf{u}_b(t)$ and $\kappa_b$, we solve (114) to find the probability distribution, and by adopting the notion of logit equilibrium [80, 94], we have

$$\beta_b^m(\mathbf{u}_b(t)) = \frac{\exp\big(\frac{1}{\kappa_b}[u_b^m]^{+}\big)}{\sum_{m' \in \mathcal{Z}_b}\exp\big(\frac{1}{\kappa_b}[u_b^{m'}]^{+}\big)}, \tag{115}$$

where $[x]^{+} \equiv \max[x, 0]$. Finally, we propose two coupled RL processes that run in parallel and allow SCs to decide their optimal strategies at each time instant $t$ as follows [80, 94].

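Risk-sensitive learning procedure: We denote $\hat{\mathbf{u}}_b(t)$ as the estimated utility of SC $b$, in which the estimated utility and the probability mass function are updated for each action $m \in \mathcal{Z}_b$ as follows:

$$\begin{aligned}
\hat{u}_b^m(t) &= \hat{u}_b^m(t-1) + \iota_b^{(1)}(t)\,\mathbb{1}_{\{z_b(t) = z_b^m\}} \times \big(u_b(t-1) - \hat{u}_b^m(t-1)\big),\\
\pi_b^m(t) &= \pi_b^m(t-1) + \iota_b^{(2)}(t)\big(\beta_b^m(\hat{\mathbf{u}}_b(t)) - \pi_b^m(t-1)\big),
\end{aligned}$$

where $\iota_b^{(1)}(t)$ and $\iota_b^{(2)}(t)$ are the learning rates, which satisfy the following conditions (due to space limits, please see [80, 94] for the convergence proof):

$$\lim_{T\to\infty}\sum_{t=0}^{T}\iota_b^{(1)}(t) = +\infty, \qquad \lim_{T\to\infty}\sum_{t=0}^{T}\iota_b^{(2)}(t) = +\infty,$$

$$\lim_{T\to\infty}\sum_{t=0}^{T}\big(\iota_b^{(1)}(t)\big)^2 < +\infty, \qquad \lim_{T\to\infty}\sum_{t=0}^{T}\big(\iota_b^{(2)}(t)\big)^2 < +\infty,$$

$$\lim_{t\to\infty}\frac{\iota_b^{(2)}(t)}{\iota_b^{(1)}(t)} = 0.$$

Finally, each SC determines $z_b^m$ as per (113).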
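The two coupled updates can be sketched in Python for a single SC. This is a simplified illustration rather than the exact procedure: the function names are hypothetical, the observed utility of the chosen action is passed in directly as `observed_utility`, and the learning rates $1/(t+1)^{0.55}$ and $1/(t+1)^{0.6}$ are one choice that satisfies the conditions above.

```python
import math


def boltzmann_gibbs(u_hat, kappa):
    """Logit distribution (115) over actions, using [u]^+ = max(u, 0)."""
    scaled = [max(u, 0.0) / kappa for u in u_hat]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]  # shifted for stability
    total = sum(weights)
    return [w / total for w in weights]


def learning_step(u_hat, pi, action, observed_utility, t, kappa):
    """One round of the coupled utility-estimate and strategy updates."""
    rate_u = 1.0 / (t + 1) ** 0.55    # faster time scale: utility estimate
    rate_pi = 1.0 / (t + 1) ** 0.6    # slower time scale: strategy
    new_u = list(u_hat)
    # Only the estimate of the action actually played is updated
    new_u[action] += rate_u * (observed_utility - u_hat[action])
    beta = boltzmann_gibbs(new_u, kappa)
    new_pi = [p + rate_pi * (b - p) for p, b in zip(pi, beta)]
    return new_u, new_pi
```

In a full loop, each SC would draw its next action from `new_pi`, for example with `random.choices(range(len(new_pi)), weights=new_pi)`, transmit, and feed the resulting utility back into `learning_step`.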



Dense SCs are randomly deployed in a 0.5 × 0.5 km$^2$ area and we assume one UE per SC with a fixed user association. We assume that each SC adjusts its beamwidth with a step of 0.05 radian from the range $[\theta^{\min}, \theta^{\max}]$, where $\theta^{\min} = 0.2$ radian and $\theta^{\max} = 0.4$ radian denote the minimum and maximum beamwidths of each SC, respectively. The transmit power level set of each SC is {21, 23, 25} dBm and the SC antenna gain is 5 dBi. The numbers of transmit antennas $N_b$ at the SC and receive antennas $N_k$ at the UE are set to 64 and 4, respectively. The blockage is modeled as a distance-dependent probability state where the channel is either line-of-sight (LOS) or non-LOS for urban environments at 28 GHz, and the system bandwidth is 1 GHz [136]. Numerical results are obtained via Monte-Carlo simulations over 50 different random topologies. The risk-sensitive parameter is set to $\mu_b = -2$. For the learning algorithm, the trade-off factor $\kappa_b$ is set to 5, while the learning rates $\iota_b^{(1)}(t)$ and $\iota_b^{(2)}(t)$ are set to $\frac{1}{(t+1)^{0.55}}$ and $\frac{1}{(t+1)^{0.6}}$, respectively [94]. Furthermore, we compare our proposed RSL scheme with the following baselines:

– Classical Learning (CSL) refers to the RL framework in which the utility function

only considers the mean value of mmWave links [94].

– Baseline 1 (BL1) refers to [120] optimizing the beamwidth with maximum transmit

power.

In Fig. 25, we plot the complementary cumulative distribution function (CCDF, i.e., the tail distribution) of the user throughput (UT) at 28 GHz when the number of SCs is 24 per km$^2$. The CCDF curves reflect the reliable probability (in both linear and logarithmic scales), defined as the probability that the UT is higher than a target rate $r_0$ Gbps, i.e., Pr(UT ≥ $r_0$). We also study the impact of imperfect CSI with $\tau_k = 0.3$ and of noisy feedback from UEs. We observe that the performance of our proposed RSL framework is reduced under these impairments. We next compare our proposed RSL method with the other baselines under perfect CSI and user feedback. It is observed that the RSL scheme achieves better reliability, Pr(UT ≥ 10 Gbps), of more than 85%, whereas the baselines CSL and BL1 obtain less than 75% and 65%, respectively. However, at a very low rate (less than 2 Gbps) or a very high rate (10.65 to 11 Gbps), captured by the cross-point, the RSL obtains a lower probability as compared to the baselines. In other words, our proposed solution provides a UT which is more concentrated around its median in order to provide uniformly great service for all users. For instance, the UT distribution of our proposed algorithm has a small variance of 0.4846, while the CSL has a higher variance of 2.6893.

Fig. 24. Reliability versus network density ([73] ©2018 IEEE). [Plot: reliability versus number of SCs per km$^2$, from 16 to 128, for risk-sensitive learning, classical learning, and Baseline 1 at $r_0$ = 2, 3, and 4 Gbps.]

Fig. 25. Tail distribution of the achievable rate, B = 24 ([73] ©2018 IEEE).

Fig. 24 reports the impact of the network density on the reliability, which is defined as the fraction of UEs who achieve a given target rate r0, i.e., $K_{r>r_0}/K$. Here, the number of SCs varies from 16 to 128 per km². For given target rates of 2, 3, and 4 Gbps, our proposed algorithm guarantees higher reliability compared to the baselines. Moreover, the higher the target rate, the bigger the performance gap between our proposed algorithm and the baselines. A linear increase in the network density reduces reliability; for example, when the density increases from 16 to 96 SCs per km², the fractions of users that achieve 4 Gbps under the RSL, CSL, and BL1 are reduced by 11.61%, 16.72%, and 39.11%, respectively. This highlights a key trade-off between reliability and network density.

[Figure 26: availability [Gbps] versus the number of SCs per km² (16 to 128), for the proposed RSL and the baselines (classical learning, Baseline 1) at 80% and 90% availability.]

Fig. 26. Availability versus network density ([73] ©2018 IEEE).
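The reliability metric defined above can be sketched in a few lines. This is a minimal illustration only: the function name `reliability` and the sample rates are hypothetical and are not taken from the simulations in this chapter.

```python
from typing import Sequence


def reliability(rates_gbps: Sequence[float], r0_gbps: float) -> float:
    """Fraction of UEs whose rate exceeds the target r0, i.e., K_{r>r0} / K."""
    if not rates_gbps:
        raise ValueError("need at least one UE rate")
    achieved = sum(1 for r in rates_gbps if r > r0_gbps)
    return achieved / len(rates_gbps)


# Hypothetical per-UE rates in Gbps, for illustration only.
example_rates = [1.5, 2.5, 3.5, 4.5, 5.5]
for target in (2.0, 3.0, 4.0):
    print(f"r0 = {target} Gbps -> reliability = {reliability(example_rates, target):.2f}")
```

With a fixed set of rates, raising the target r0 can only lower the computed fraction, which is the behaviour the three target rates in Fig. 24 exhibit.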

In Fig. 26 we show the impact of the network density on the availability, which is defined as the rate that is obtained with a given target probability. We plot the rate r Gbps that the system achieves with probabilities of at least 80% and 90%. For a target probability of 90%, our proposed algorithm guarantees more than 9 Gbps of UT, whereas the baselines guarantee less than 7.5 Gbps of UT, for B = 16. If we lower the target probability to 80%, the achievable rate increases by 5%. This gives rise to a trade-off between the reliability and the data rate. In addition, for a given probability, the achievable rate r decreases as the network density increases. For instance, when the network density increases from 16 to 80 SCs per km², the achievable rate is reduced by 50%. This highlights the trade-off between availability and network density.
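The availability metric can be read as an empirical quantile of the rate distribution. The sketch below assumes this reading; the function name `availability`, the rounding tolerance, and the sample rates are hypothetical and are not the thesis's simulation code.

```python
import math
from typing import Sequence


def availability(rates_gbps: Sequence[float], target_prob: float) -> float:
    """Largest rate r such that at least a fraction target_prob of samples is >= r."""
    if not rates_gbps:
        raise ValueError("need at least one rate sample")
    if not 0.0 < target_prob <= 1.0:
        raise ValueError("target_prob must lie in (0, 1]")
    ordered = sorted(rates_gbps, reverse=True)
    # The k-th largest sample is reached or exceeded by at least k of the K samples.
    # The small tolerance guards against floating-point error in target_prob * K.
    k = max(1, math.ceil(target_prob * len(ordered) - 1e-9))
    return ordered[k - 1]


# Hypothetical rate samples in Gbps, for illustration only.
sample_rates = [float(r) for r in range(1, 11)]
for prob in (0.9, 0.8):
    print(f"{prob:.0%} availability -> {availability(sample_rates, prob)} Gbps")
```

Lowering the target probability never decreases the returned rate, which mirrors the trade-off between reliability and data rate noted above.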

We numerically observe that T = 4000 iterations is long enough for the agents to learn and reach the optimal solution. We assume that the channel condition changes after every T = 4000 iterations. Our proposed algorithm converges faster than the classical learning baseline, as shown in Fig. 27. By harnessing the notion of risk aversion, the agents attempt to find the best strategy subject to the variations of the mmWave rates.


[Figure 27: achievable rate [Gbps] versus iterations (up to 4000) for the proposed RSL and classical learning; annotation: proposed algorithm converges faster.]

Fig. 27. Convergence of the proposed RSL and classical RL ([73] ©2018 IEEE).
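To illustrate why a risk-averse agent favours links with small rate variations, the sketch below compares two hypothetical links with the same mean rate. It assumes an entropic utility of the form (1/μ) log E[exp(μ r)] with μ < 0, a common risk-sensitive choice whose second-order expansion is the mean plus (μ/2) times the variance. The exact utility of the proposed RSL framework is defined earlier in the thesis and may differ; the function names, the value of μ, and the rate samples are all assumptions made for illustration.

```python
import math
from statistics import fmean, pvariance
from typing import Sequence


def entropic_utility(rates: Sequence[float], mu: float) -> float:
    """Entropic risk-sensitive utility (1/mu) * log(E[exp(mu * r)]).

    Assumption: mu < 0 models a risk-averse agent. This is a common textbook
    form and not necessarily the exact utility used in the thesis.
    """
    if mu == 0:
        raise ValueError("mu must be non-zero")
    return math.log(fmean([math.exp(mu * r) for r in rates])) / mu


def mean_variance_utility(rates: Sequence[float], mu: float) -> float:
    """Second-order approximation of the entropic utility: mean + (mu / 2) * variance."""
    return fmean(rates) + 0.5 * mu * pvariance(rates)


# Hypothetical rate samples in Gbps with the same mean but different spread.
steady_link = [9.5, 10.0, 10.5]
bursty_link = [6.0, 10.0, 14.0]
MU = -0.5  # illustrative risk-sensitivity parameter (assumption)

for name, samples in (("steady", steady_link), ("bursty", bursty_link)):
    print(
        name,
        round(entropic_utility(samples, MU), 3),
        round(mean_variance_utility(samples, MU), 3),
    )
```

Both links have the same average rate, yet the utility of the bursty link is lower, so an agent maximizing this utility prefers the steadier link. A purely average-based design would be indifferent between the two.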


This chapter studied the problem of providing multi-gigabit wireless access with reliable communication by optimizing the transmit beam and considering the link sensitivity in 5G mmWave networks. A distributed risk-sensitive RL-based approach was proposed, taking into account both the mean and the variance of the mmWave links. Numerical results show that our proposed approach provides better services for all users. For instance, with 24 small cells the proposed approach achieves a Pr(UT≥10 Gbps) higher than 85%, whereas the baselines obtain less than 75% and 65%.

As studied in Chapter 4, the proposed reinforcement learning algorithm works only in static and sparse networks. In a high-mobility environment, a fast-converging solution is required. Together with the problem of beam selection and power allocation, beam tracking and alignment become more challenging in high-mobility mmWave networks.


7 Conclusions and future work

This chapter concludes the thesis and provides several future directions in view of up-

coming 5G wireless systems and beyond.

7.1 Conclusions

The focus of this thesis is to propose an integrated access-backhaul architecture for the

deployment of 5G wireless networks and beyond. As networks become denser in terms

of the numbers of users and base stations, it is highly challenging to implement network

planning and optimization. To achieve this, joint resource allocation and interference

mitigation schemes were proposed to answer the aforementioned fundamental questions

in Chapters 3-6 under different network architectures. By leveraging three key enabling

technologies, namely mmWave communication, massive MIMO, and dense small cells,

this thesis finds solutions to provide high data rates, low latency, and high reliability.

The research results also provide deployment guidelines for 5G wireless networks and

beyond, summarized as follows:

Chapter 3 answered the following questions: for a given target UE throughput, what is the optimal number of UEs to schedule, and what is the optimal/maximum number of SCs to deploy? The studied problem was decoupled into the dynamic scheduling of UEs, the backhaul provisioning of in-band FD-enabled SCs, and the offloading of UEs to in-band FD-enabled SCs as a function of interference, the number of antennas, and backhaul loads. In addition, the results show that at higher frequency bands FD-enabled SCs work better in an open access mode than in a closed access mode under the same transmit power budget. In particular, with increasing SC density, open access FD-enabled SCs achieve 5.6× gains in terms of cell-edge performance compared to the closed access ones in ultra-dense networks with 350 small cell base stations per km².

Chapter 4 provided solutions for the problem of multi-hop multi-path transmis-

sions in mmWave networks. In particular, the solutions provide guidelines for selecting

the best paths among the possible paths and how to assign transmission rates over these

paths, while satisfying probabilistic latency constraint and maintaining network stabil-

ity. Reinforcement learning was employed by utilizing the benefits of historical infor-

mation to select the best paths based on their empirical distributions. A probabilistic


latency constraint was incorporated into the rate allocation problem, so that an upper latency bound could be guaranteed up to a small violation probability.

In 5G networks and beyond, an important concern is how to support ultra-low la-

tency and highly reliable communications. Chapter 5 discussed the latency issue in

mmWave-enabled massive MIMO systems in which a latency bound violation was

characterized with a tolerable probability. The research results demonstrated that for

a limited maximum transmit power and very high traffic demands, the latency requirement could not be guaranteed. This highlights the trade-off between the mean arrival rate and latency. In addition, only a limited number of users can be served while guaranteeing the delay requirement; beyond this number, a trade-off between latency and network density exists.

In Chapter 6, a novel solution was proposed to provide Gbps data transmission with

reliable communication in mmWave environments, where the channels fluctuate strongly and the links are sensitive to blockages. The new approach departs from the classical average-based system design and instead takes higher-order moments into account in the utility function. This yields a risk-sensitive reinforcement learning framework through which every small cell exploits the diversity of multiple antennas and the higher bandwidth to optimize its transmission while taking signal fluctuations into account.

In particular, the proposed solution provided a UT which is more concentrated around

its median to support a uniformly high level of service for all users. For instance, the UT

distribution of our proposed algorithm has a small variance of 0.4846, while the CSL

has a higher variance of 2.6893. The results established important trade-offs between

network density and reliability/availability.

In summary, an integrated access-backhaul architecture was proposed for the de-

ployment of future networks. In particular, as networks grow denser in terms of users

and small cell base stations, the proposed IAB architecture simultaneously schedules

the users and provides a wireless backhaul for the dense deployment of small cells.

In this regard, joint resource allocation and interference mitigation solutions were proposed for two-hop and multi-hop self-backhauled millimeter wave (mmWave) networks. Further, the thesis provides solutions to support low-latency and reliable communications and establishes key trade-offs, such as those between network density and latency/reliability.


7.2 Future work

First, some idealized assumptions in this thesis should be pointed out. In particular, the self-interference cancellation (SIC), backhaul synchronization, channel reciprocity, and traffic aggregation were assumed to be perfect. One of the most important extensions is to investigate the impact of imperfect SIC on the performance of IAB systems and to develop algorithms that approach near-perfect SIC performance. Imperfect backhaul synchronization and traffic aggregation cause additional latency. In fact, the idealized assumptions made in the thesis provide upper bounds on the performance achievable in practice, and thus, future work should take these non-idealities into account to bridge the performance gap between theory and practice. Moreover, regarding the beamforming techniques for mmWave communications, Chapters 3 and 5 studied a simple system model in which digital beamforming was employed, while analog beamforming was not properly addressed. In this regard, future work could employ hybrid beamforming techniques to improve the beamforming gain and reduce the power consumption and hardware cost with a limited number of radio-frequency chains.

Finally, some possible research directions are listed as follows:

– There is a need to evaluate the impact of imperfect SIC on the IAB system for both

TDD and FDD protocols.

– In mmWave communications in high-mobility environments, the coherence time is much shorter, and thus, ultra-fast and efficient beam tracking and alignment are of high importance.

– The proposed reinforcement learning algorithms allow individual network elements to operate independently in a distributed manner. However, the main drawback of RL is its slow convergence when the state-action spaces are large. In particular, dynamic networks with high mobility that demand high reliability and low latency require optimal solutions within a reasonable time. In this regard, deep reinforcement learning is a promising approach to obtain faster convergence and handle a large number of state-action pairs.

– A large network optimization problem becomes extremely challenging, even though in higher frequency band environments the interference can be less severe. In particular, with a large number of users, a dense deployment of base stations, high mobility, and varying traffic demands and QoS requirements, the problem of universal load balancing and interference management (ULBIM) becomes extremely complex, as the multiple variables are not easy to decouple. Unavoidably, 5G and beyond networks need new powerful tools to solve the ULBIM problem. In this regard, artificial intelligence and machine learning are currently being investigated for wireless networks.


References

[1] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. Soong, and J. C. Zhang,

“What will 5G be?” IEEE Journal on Selected Areas in Communications, vol. 32, no. 6,

pp. 1065–1082, June 2014.

[2] F. Boccardi, R. W. Heath, A. Lozano, T. L. Marzetta, and P. Popovski, “Five disruptive

technology directions for 5G,” IEEE Communications Magazine, vol. 52, no. 2, pp. 74–80,

Feb. 2014.

[3] P. Popovski, “Ultra-reliable communication in 5G wireless systems,” Proceedings - Inter-

national Conference 5G for Ubiquitous Connectivity (5GU), pp. 146–151, 2014.

[4] G. Durisi, T. Koch, and P. Popovski, “Toward massive, ultrareliable, and low-latency wire-

less communication with short packets,” Proceedings of the IEEE, vol. 104, no. 9, pp.

1711–1726, 2016.

[5] P. Popovski, J. J. Nielsen, C. Stefanovic, E. de Carvalho, E. Strom, K. F. Trillingsgaard,

A.-S. Bana, D. M. Kim, R. Kotaba, J. Park et al., “Wireless access for ultra-reliable low-latency communication: Principles and building blocks,” IEEE Network, vol. 32, no. 2,

pp. 16–23, 2018.

[6] M. Bennis, M. Debbah, and H. V. Poor, “Ultra-reliable and low-latency wireless commu-

nication: Tail, risk and scale,” Proceedings of the IEEE, vol. 106, no. 10, pp. 1834–1853,

2018.

[7] C. Bockelmann, N. Pratas, H. Nikopour, K. Au, T. Svensson, C. Stefanovic, P. Popovski,

and A. Dekorsy, “Massive machine-type communications in 5G: Physical and MAC-layer

solutions,” IEEE Communications Magazine, vol. 54, no. 9, pp. 59–65, 2016.

[8] Z. Dawy, W. Saad, A. Ghosh, J. G. Andrews, and E. Yaacoub, “Toward massive machine

type cellular communications,” IEEE Wireless Communications, vol. 24, no. 1, pp. 120–

128, 2017.

[9] K. Miyauchi, “Millimeter-Wave Communication,” Infrared and millimeter waves, vol. 9,

1983.

[10] T. S. Rappaport, S. Sun, R. Mayzus, H. Zhao, Y. Azar, K. Wang, G. N. Wong, J. K. Schulz,

M. Samimi, and F. Gutierrez Jr, “Millimeter wave mobile communication for 5G cellular:

It will work!” IEEE Access, vol. 1, pp. 335–349, 2013.

[11] Y. Niu, Y. Li, D. Jin, L. Su, and A. V. Vasilakos, “A survey of millimeter wave communi-

cation (mmWave) for 5G: opportunities and challenges,” Wireless Networks, vol. 21, no. 8,

pp. 2657–2676, 2015.

[12] F. Gómez-Cuba, E. Erkip, S. Rangan, and F. J. González-Castaño, “Capacity scaling of

cellular networks: Impact of bandwidth, infrastructure density and number of antennas,”

IEEE Transactions on Wireless Communications, vol. 17, no. 1, pp. 652–666, 2018.


[13] M. Xiao, S. Mumtaz, Y. Huang, L. Dai, Y. Li, M. Matthaiou, G. K. Karagiannidis, E. Björn-

son, K. Yang, I. Chih-Lin et al., “Millimeter wave communications for future mobile net-

works,” IEEE Journal on Selected Areas in Communications, vol. 35, no. 9, pp. 1909–

1935, 2017.

[14] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station

antennas,” IEEE Transactions on Wireless Communications, vol. 9, no. 11, pp. 3590–3600,

2010.

[15] J. Hoydis, K. Hosseini, S. Brink, and M. Debbah, “Making smart use of excess antennas:

Massive MIMO, Small Cells, and TDD,” Bell Labs Technical Journal, vol. 18, no. 2, pp.

5–21, 2013.

[16] F. Rusek, D. Persson, B. Lau, E. Larsson, T. Marzetta, O. Edfors, and F. Tufvesson, “Scal-

ing up MIMO: Opportunities and challenges with very large arrays,” IEEE Signal Process-

ing Magazine, vol. 30, no. 1, pp. 40–60, 2013.

[17] V. Chandrasekhar, J. G. Andrews, and A. Gatherer, “Femtocell networks: a survey,” IEEE

Communications Magazine, vol. 46, no. 9, 2008.

[18] J. G. Andrews, “Seven ways that hetnets are a cellular paradigm shift,” IEEE Communica-

tions Magazine, vol. 51, no. 3, pp. 136–144, 2013.

[19] A. Anpalagan, M. Bennis, and R. Vannithamby, Design and deployment of small cell net-

works. Cambridge University Press, 2015.

[20] M. Bennis, M. Simsek, A. Czylwik, W. Saad, S. Valentin, and M. Debbah, “When cellular

meets WiFi in wireless small cell networks,” IEEE Communications Magazine, vol. 51,

no. 6, pp. 44–50, 2013.

[21] K. Hosseini, J. Hoydis, S. Brink, and M. Debbah, “Massive MIMO and Small Cells:

How to densify heterogeneous networks,” Proceedings - IEEE International Conference

on Communications (ICC), pp. 5442–5447, 2013.

[22] N. Bhushan, J. Li, D. Malladi, R. Gilmore, D. Brenner, A. Damnjanovic, R. Sukhavasi,

C. Patel, and S. Geirhofer, “Network densification: The dominant theme for wireless evo-

lution into 5G,” IEEE Communications Magazine, vol. 52, no. 2, pp. 82–89, 2014.

[23] T. K. Vu, M. Bennis, S. Samarakoon, M. Debbah, and M. Latva-aho, “Joint load balancing

and interference mitigation in 5G heterogeneous networks,” IEEE Transactions on Wire-

less Communications, vol. 16, no. 9, pp. 6032–6046, 2017.

[24] T. K. Vu, M. Bennis, M. Debbah, and M. Latva-aho, “Joint path selection and rate al-

location framework for 5G self-backhauled mmWave networks,” IEEE Transactions on

Wireless Communications, vol. 18, no. 4, pp. xxxx–xxxx, 2019.

[25] A. Liu and V. Lau, “Hierarchical interference mitigation for massive MIMO cellular net-

works,” IEEE Transactions on Signal Processing, vol. 62, no. 18, pp. 4786–4797, 2014.


[26] D. López-Pérez, A. Valcarce, G. De La Roche, and J. Zhang, “OFDMA femtocells: A

roadmap on interference avoidance,” IEEE Communications Magazine, vol. 47, no. 9,

2009.

[27] N. Saquib, E. Hossain, L. B. Le, and D. I. Kim, “Interference management in OFDMA fem-

tocell networks: Issues and approaches,” IEEE Wireless Communications, vol. 19, no. 3,

2012.

[28] C. H. de Lima, M. Bennis, and M. Latva-aho, “Coordination mechanisms for self-

organizing femtocells in two-tier coexistence scenarios,” IEEE Transactions on Wireless

Communications, vol. 11, no. 6, pp. 2212–2223, 2012.

[29] T. K. Vu, K. Sungoh, and O. Sangchul, “Cooperative interference mitigation algorithm

in heterogeneous networks,” IEICE Transactions on Communications, vol. 98, no. 11, pp.

2238–2247, 2015.

[30] E. Bastug, M. Bennis, M. Kountouris, and M. Debbah, “Cache-enabled small cell net-

works: Modeling and tradeoffs,” EURASIP Journal on Wireless Communications and Net-

working, vol. 2015, no. 1, p. 41, 2015.

[31] D. Bharadia, E. McMilin, and S. Katti, “Full duplex radios,” in ACM SIGCOMM Computer

Communication Review, vol. 43, no. 4. ACM, 2013, pp. 375–386.

[32] A. Sabharwal, P. Schniter, D. Guo, D. W. Bliss, S. Rangarajan, and R. Wichman, “In-band

full-duplex wireless: Challenges and opportunities,” IEEE Journal on Selected Areas in Communications, vol. 32, no. 9, pp. 1637–1652, 2014.

[33] L. Song, R. Wichman, Y. Li, and Z. Han, Full-duplex communications and networks.

Cambridge University Press, 2017.

[34] G. R. Kenworthy, “Self-cancelling full-duplex RF communication system,” 1997, US Patent 5,691,978.

[35] S. Hong, J. Brand, J. I. Choi, M. Jain, J. Mehlman, S. Katti, and P. Levis, “Applications

of self-interference cancellation in 5G and beyond,” IEEE Communications Magazine,

vol. 52, no. 2, pp. 114–121, 2014.

[36] G. Liu, F. R. Yu, H. Ji, V. C. Leung, and X. Li, “In-band full-duplex relaying: A survey,

research issues and challenges,” Resource, vol. 147, p. 172, 2015.

[37] Z. Zhang, X. Chai, K. Long, A. V. Vasilakos, and L. Hanzo, “Full duplex techniques for

5G networks: self-interference cancellation, protocol design, and relay selection,” IEEE

Communications Magazine, vol. 53, no. 5, pp. 128–137, 2015.

[38] M. S. Elbamby, M. Bennis, W. Saad, M. Debbah, and M. Latva-Aho, “Resource optimiza-

tion and power allocation in in-band full duplex-enabled non-orthogonal multiple access

networks,” IEEE Journal on Selected Areas in Communications, vol. 35, no. 12, pp. 2860–

2873, 2017.


[39] J. Du, E. Onaran, D. Chizhik, S. Venkatesan, and R. A. Valenzuela, “Gbps user rates using

mmWave relayed backhaul with high-gain antennas,” IEEE Journal on Selected Areas in


[40] E. Castaneda, A. Silva, A. Gameiro, and M. Kountouris, “An overview on resource alloca-

tion techniques for multi-user MIMO systems,” IEEE Communications Surveys & Tutori-

als, vol. 19, no. 1, pp. 239–284, 2017.

[41] S. Wagner, R. Couillet, M. Debbah, and D. Slock, “Large system analysis of linear precod-

ing in correlated MISO broadcast channels under limited feedback,” IEEE Transactions

on Information Theory, vol. 58, no. 7, pp. 4509–4537, 2012.

[42] E. G. Larsson, O. Edfors, F. Tufvesson, and T. L. Marzetta, “Massive MIMO for next

generation wireless systems,” IEEE Communications Magazine, vol. 52, no. 2, pp. 186–

195, 2014.

[43] L. Sanguinetti, A. Moustakas, and M. Debbah, “Interference management in 5G reverse

TDD HetNets: A large system analysis,” IEEE Journal on Selected Areas in Communica-

tions, vol. 33, pp. 1187–1200, 2015.

[44] J. Flordelis, F. Rusek, F. Tufvesson, E. G. Larsson, and O. Edfors, “Massive MIMO perfor-

mance TDD versus FDD: What do measurements say?” IEEE Transactions on Wireless


[45] N. Akbar, N. Yang, P. Sadeghi, and R. A. Kennedy, “Multi-cell multiuser massive MIMO

networks: User capacity analysis and pilot design,” IEEE Transactions on Communica-

tions, vol. 64, no. 12, pp. 5064–5077, 2016.

[46] X. Zhu, Z. Wang, L. Dai, and C. Qian, “Smart pilot assignment for massive MIMO,” IEEE

Communications Letters, vol. 19, no. 9, pp. 1644–1647, 2015.

[47] J.-C. Shen, J. Zhang, and K. B. Letaief, “Downlink user capacity of massive MIMO under

pilot contamination,” IEEE Transactions on Wireless Communications, vol. 14, no. 6, pp.

3183–3193, 2015.

[48] E. Björnson, E. G. Larsson, and M. Debbah, “Massive MIMO for maximal spectral effi-

ciency: How many users and pilots should be allocated?” IEEE Transactions on Wireless


[49] M. R. Akdeniz, Y. Liu, M. K. Samimi, S. Sun, S. Rangan, T. S. Rappaport, and E. Erkip,

“Millimeter wave channel modeling and cellular capacity evaluation,” IEEE J. Sel. Areas

Commun., vol. 32, no. 6, pp. 1164–1179, Jun. 2014.

[50] A. L. Swindlehurst et al., “Millimeter-wave massive MIMO: The next wireless revolu-

tion?” IEEE Communications Magazine, vol. 52, no. 9, pp. 56–62, Sep. 2014.

[51] 3GPP, “Technical Specification Group Radio Access Network; Study on Integrated Access

and Backhaul for NR,” 3rd Generation Partnership Project (3GPP), Technical Specification

(TS) 38.874, 2018, rel-15.


[52] C. Dehos, J. L. González, A. De Domenico, D. Ktenas, and L. Dussopt, “Millimeter-wave

access and backhauling: the solution to the exponential data traffic increase in 5G mobile

communications systems?” IEEE Communications Magazine, vol. 52, no. 9, pp. 88–95,

2014.

[53] N. Omidvar, A. Liu, V. Lau, F. Zhang, and M. R. Pakravan, “Optimal hierarchical radio

resource management for hetnets with flexible backhaul,” IEEE Transactions on Wireless


[54] O. Tipmongkolsilp, S. Zaghloul, and A. Jukan, “The evolution of cellular backhaul tech-

nologies: Current issues and future trends,” IEEE Commun. Surveys & Tutorials, vol. 13,

no. 1, pp. 97–113, 2011.

[55] A. De La Oliva, X. C. Pérez, A. Azcorra, A. Di Giglio, F. Cavaliere, D. Tiegelbekkers,

J. Lessmann, T. Haustein, A. Mourad, and P. Iovanna, “Xhaul: toward an integrated

fronthaul/backhaul architecture in 5G networks,” IEEE Wireless Communications, vol. 22,

no. 5, pp. 32–40, 2015.

[56] M. Jaber, M. A. Imran, R. Tafazolli, and A. Tukmanov, “5G backhaul challenges and

emerging research directions: A survey,” IEEE Access, vol. 4, pp. 1743–1766, 2016.

[57] D. Tse and P. Viswanath, Fundamentals of wireless communication. Cambridge University Press, 2005.

[58] E. Dahlman, G. Mildh, S. Parkvall, J. Peisa, J. Sachs, Y. Selén, and J. Sköld, “5G wireless

access: requirements and realization,” IEEE Communications Magazine, vol. 52, no. 12,

pp. 42–47, 2014.

[59] P. Rost, A. Banchs, I. Berberana, M. Breitbach, M. Doll, H. Droste, C. Mannweiler, M. A.

Puente, K. Samdanis, and B. Sayadi, “Mobile network architecture evolution toward 5G,”

IEEE Communications Magazine, vol. 54, no. 5, pp. 84–91, 2016.

[60] I. Chih-Lin, S. Han, Z. Xu, S. Wang, Q. Sun, and Y. Chen, “New paradigm of 5G wireless

internet,” IEEE Journal on Selected Areas in Communications, vol. 34, no. 3, pp. 474–482,

2016.

[61] C. Perfecto, J. Del Ser, and M. Bennis, “Millimeter-wave V2V communications: Dis-

tributed association and beam alignment,” IEEE Journal on Selected Areas in Communi-

cations, vol. 35, no. 9, pp. 2148–2162, 2017.

[62] S. Samarakoon, M. Bennis, W. Saad, M. Debbah, and M. Latva-Aho, “Ultra dense small

cell networks: Turning density into energy efficiency,” IEEE Journal on Selected Areas in


[63] K. Son, S. Chong, and G. De Veciana, “Dynamic association for load balancing and in-

terference avoidance in multi-cell networks,” IEEE Transactions on Wireless Communica-

tions, vol. 8, no. 7, 2009.

[64] H. Kim, G. De Veciana, X. Yang, and M. Venkatachalam, “Distributed α-optimal user

association and cell load balancing in wireless networks,” IEEE/ACM Transactions on

Networking, vol. 20, no. 1, pp. 177–190, 2012.


[65] Q. Ye, B. Rong, Y. Chen, M. Al-Shalash, C. Caramanis, and J. G. Andrews, “User as-

sociation for load balancing in heterogeneous cellular networks,” IEEE Transactions on

Wireless Communications, vol. 12, no. 6, pp. 2706–2716, 2013.

[66] D. Bethanabhotla, O. Y. Bursalioglu, H. C. Papadopoulos, and G. Caire, “Optimal user-

cell association for massive MIMO wireless networks,” IEEE Transactions on Wireless


[67] J. Andrews, S. Singh, Q. Ye, X. Lin, and H. Dhillon, “An overview of load balancing in

HetNets: Old myths and open problems,” IEEE Wireless Communications, vol. 21, no. 2,

pp. 18–25, 2014.

[68] D. Liu, L. Wang, Y. Chen, M. Elkashlan, K. K. Wong, R. Schober, and L. Hanzo, “User

association in 5G networks: A survey and an outlook,” IEEE Communications Surveys &

Tutorials, vol. 18, no. 2, pp. 1018–1044, 2016.

[69] S. Hur et al., “Millimeter wave beamforming for wireless backhaul and access in small

cell networks,” IEEE Transactions on Communications, vol. 61, no. 10, pp. 4391–4403,

Oct. 2013.

[70] F. Sohrabi and W. Yu, “Hybrid digital and analog beamforming design for large-scale

antenna arrays,” IEEE Journal of Selected Topics in Signal Processing, vol. 10, no. 3, pp.

501–513, 2016.

[71] L. Zhao, D. W. K. Ng, and J. Yuan, “Multi-user precoding and channel estimation for

hybrid millimeter wave systems,” IEEE Journal on Selected Areas in Communications,

vol. 35, no. 7, pp. 1576–1590, 2017.

[72] T. K. Vu, C.-F. Liu, M. Bennis, M. Debbah, M. Latva-aho, and C. S. Hong, “Ultra-reliable

and low latency communication in mmWave-enabled massive MIMO networks,” IEEE

Communications Letters, vol. 21, no. 9, pp. 2041–2044, 2017.

[73] T. K. Vu, M. Bennis, M. Debbah, M. Latva-aho, and C. S. Hong, “Ultra-reliable com-

munication in 5G mmWave networks: A risk-sensitive approach,” IEEE Communications

Letters, vol. 22, no. 4, pp. 708–711, 2018.

[74] T. K. Vu, M. Bennis, S. Samarakoon, M. Debbah, and M. Latva-aho, “Joint in-band backhauling and interference mitigation in 5G heterogeneous networks,” Proceedings - 22nd Eur. Wireless Conf., pp. 1–6, 2016.

[75] T. K. Vu, C.-F. Liu, M. Bennis, M. Debbah, and M. Latva-aho, “Path selection and rate

allocation in self-backhauled mmWave networks,” Proceedings - IEEE Wireless Commu-

nications and Networking Conference (WCNC), pp. 1–6, 2018.

[76] L. Georgiadis, M. J. Neely, and L. Tassiulas, “Resource allocation and cross-layer control

in wireless networks,” Foundations and Trends in Networking, vol. 1, no. 1, pp. 1–144,

2006.

[77] M. J. Neely, “Stochastic network optimization with application to communication and

queueing systems,” Synthesis Lectures on Communication Networks, vol. 3, no. 1, pp.

1–211, 2010.


[78] A. Ben-Tal and A. Nemirovski, “On polyhedral approximations of the second-order cone,”

Mathematics of Operations Research, vol. 26, no. 2, pp. 193–205, 2001.

[79] R. Couillet and M. Debbah, Random matrix methods for wireless communications. Cam-

bridge University Press, 2011.

[80] S. Lasaulce and H. Tembine, Game theory and learning for wireless networks: Fundamen-

tals and applications. Academic Press, 2011.

[81] U. Ugurlu, T. Riihonen, and R. Wichman, “Optimized in-band full-duplex MIMO relay

under single-stream transmission,” IEEE Transactions on Vehicular Technology, vol. 65,

no. 1, pp. 155–168, 2016.

[82] Z. Jun et al., “Large system analysis of cognitive radio network via partially-projected regu-

larized zero-forcing precoding,” IEEE Transactions on Wireless Communications, vol. 14,

no. 9, pp. 4934–4947, 2015.

[83] H. Boche, S. Naik, and M. Schubert, “Pareto boundary of utility sets for multiuser wireless

systems,” IEEE/ACM Transactions on Networking, vol. 19, no. 2, pp. 589–601, 2011.

[84] Z. Chen, S. Vorobyov, C. Wang, J. Thompson et al., “Pareto region characterization for

rate control in MIMO interference systems and Nash bargaining,” IEEE Transactions on

Automatic Control, vol. 57, no. 12, pp. 3203–3208, 2012.

[85] A. Beck, A. Ben-Tal, and L. Tetruashvili, “A sequential parametric convex approxima-

tion method with applications to nonconvex truss topology design problems,” Journal of

Global Optimization, vol. 47, no. 1, pp. 29–51, 2010.

[86] L. Tran, M. F. Hanif, A. Tölli, and M. Juntti, “Fast converging algorithm for weighted sum

rate maximization in multicell MISO downlink,” IEEE Signal Processing Letters, vol. 19,

no. 12, pp. 872–875, 2012.

[87] H. Li, L. Song, and M. Debbah, “Energy efficiency of large-scale multiple antenna systems

with transmit antenna selection,” IEEE Transactions on Communications, vol. 62, no. 2,

pp. 638–647, 2014.

[88] J. Löfberg, “YALMIP: A toolbox for modeling and optimization in MATLAB,” Proceed-

ings - IEEE International Symposium on Computer Aided Control Systems Design, pp.

284–289, 2004.

[89] K.-C. Toh, M. J. Todd, and R. H. Tütüncü, “SDPT3 - a MATLAB software package for

semidefinite programming, version 1.3,” Optimization Methods and Software, vol. 11, no.

1-4, pp. 545–581, 1999.

[90] A. MOSEK, “The MOSEK optimization toolbox for MATLAB manual, Version 7.1 (Revision 28),” http://mosek.com (accessed on March 20, 2015), 2015.

[91] J. Mo and J. Walrand, “Fair end-to-end window-based congestion control,” IEEE/ACM

Transactions on Networking, vol. 8, no. 5, pp. 556–567, 2000.


[92] A. Roivainen, C. F. Dias, N. Tervo, V. Hovinen, M. Sonkki, and M. Latva-aho, “Geometry-based stochastic channel model for two-story lobby environment at 10 GHz,” IEEE Transactions on Antennas and Propagation, vol. 64, no. 9, pp. 3990–4003, 2016.

[93] 3GPP, “Evolved Universal Terrestrial Radio Access (E-UTRA); Radio Frequency (RF)

system scenarios,” 3rd Generation Partnership Project (3GPP), Technical Specification

(TS) 36.942, 2014, rel-12.

[94] M. Bennis, S. M. Perlaza, P. Blasco, Z. Han, and H. V. Poor, “Self-organization in small

cell networks: A reinforcement learning approach,” IEEE Transactions on Wireless Com-

munications, vol. 12, no. 7, pp. 3202–3212, 2013.

[95] S. Singh, M. N. Kulkarni, A. Ghosh, and J. G. Andrews, “Tractable model for rate in

self-backhauled millimeter wave cellular networks,” IEEE Journal on Selected Areas in


[96] H. Shokri-Ghadikolaei and C. Fischione, “The transitional behavior of interference in mil-

limeter wave networks and its impact on medium access control,” IEEE Transactions on


[97] M. Rebato, M. Mezzavilla, S. Rangan, F. Boccardi, and M. Zorzi, “Understanding noise

and interference regimes in 5G millimeter-wave cellular networks,” Proceedings - 22nd

European Wireless Conference, pp. 1–5, 2016.

[98] V. Petrov, M. Komarov, D. Moltchanov, J. M. Jornet, and Y. Koucheryavy, “Interference

and SINR in millimeter wave and terahertz communication systems with blocking and

directional antennas,” IEEE Transactions on Wireless Communications, vol. 16, no. 3, pp.

1791–1808, 2017.

[99] A. Zhou, M. Liu, Z. Li, and E. Dutkiewicz, “Cross-layer design for proportional delay

differentiation and network utility maximization in multi-hop wireless networks,” IEEE

Transactions on Wireless Communications, vol. 11, no. 4, pp. 1446–1455, 2012.

[100] L. X. Bui, R. Srikant, and A. Stolyar, “A novel architecture for reduction of delay and

queueing structure complexity in the back-pressure algorithm,” IEEE/ACM Transactions

on Networking, vol. 19, no. 6, pp. 1597–1609, 2011.

[101] E. Stai and S. Papavassiliou, “User optimal throughput-delay trade-off in multihop networks under NUM framework,” IEEE Communications Letters, vol. 18, no. 11, pp. 1999–2002, 2014.

[102] A. Zhou, M. Liu, Z. Li, and E. Dutkiewicz, “Joint traffic splitting, rate control, routing, and

scheduling algorithm for maximizing network utility in wireless mesh networks,” IEEE

Transactions on Vehicular Technology, vol. 65, no. 4, pp. 2688–2702, 2016.

[103] J. Garcia-Rois, F. Gomez-Cuba, M. R. Akdeniz, F. J. Gonzalez-Castano, J. C. Burguillo,

S. Rangan, and B. Lorenzo, “On the analysis of scheduling in dynamic duplex multi-

hop mmWave cellular systems,” IEEE Transactions on Wireless Communications, vol. 14,

no. 11, pp. 6028–6042, 2015.


[104] G. Narlikar, G. Wilfong, and L. Zhang, “Designing multihop wireless backhaul networks

with delay guarantees,” Wireless Networks, vol. 16, no. 1, pp. 237–254, 2010.

[105] D. Jurca and P. Frossard, “Media flow rate allocation in multipath networks,” IEEE Transactions on Multimedia, vol. 9, pp. 1227–1240, 2007.

[106] S. Kompella, S. Mao, Y. T. Hou, and H. D. Sherali, “On path selection and rate allocation

for video in wireless mesh networks,” IEEE/ACM Transactions on Networking, vol. 17,

no. 1, pp. 212–224, 2009.

[107] E. Björnson, L. Sanguinetti, J. Hoydis, and M. Debbah, “Optimal design of energy-

efficient multi-user MIMO systems: Is massive MIMO the answer?” IEEE Transactions

on Wireless Communications, vol. 14, no. 6, pp. 3059–3075, 2015.

[108] B. Sahoo, C.-H. Yao, and H.-Y. Wei, “Millimeter-wave multi-hop wireless backhauling

for 5G cellular networks,” pp. 1–6, June 2017.

[109] G. Yang, M. Haenggi, and M. Xiao, “Traffic allocation for low-latency multi-hop

millimeter-wave networks with buffers,” IEEE Transactions on Communications, 2018.

[110] P. Key, L. Massoulié, and D. Towsley, “Path selection and multipath congestion control,”

Proceedings - 26th IEEE International Conference on Computer Communications (INFO-

COM), pp. 143–151, 2007.

[111] Z. Zhang, X. Chai, K. Long, A. V. Vasilakos, and L. Hanzo, “Full duplex techniques for

5G networks: Self-interference cancellation, protocol design, and relay selection,” IEEE


[112] A. Alkhateeb, O. El Ayach, G. Leus, and R. W. Heath, “Channel estimation and hybrid

precoding for millimeter wave cellular systems,” IEEE Journal of Selected Topics in Signal

Processing, vol. 8, no. 5, pp. 831–846, 2014.

[113] D. H. Nguyen, L. B. Le, and T. Le-Ngoc, “Hybrid MMSE precoding for mmWave mul-

tiuser MIMO systems,” Proceedings - IEEE International Conference on Communications

(ICC), pp. 1–6, 2016.

[114] A. Alkhateeb, G. Leus, and R. W. Heath, “Limited feedback hybrid precoding for multi-

user millimeter wave systems,” IEEE Transactions on Wireless Communications, vol. 14,

no. 11, pp. 6481–6494, 2015.

[115] S. Singh, M. Gerasimenko, S.-P. Yeh, N. Himayat, and S. Talwar, “Proportional fair traffic splitting and aggregation in heterogeneous wireless networks,” IEEE Communications Letters, vol. 20, no. 5, pp. 1010–1013, 2016.

[116] M. Giordani, M. Mezzavilla, S. Rangan, and M. Zorzi, “Multi-connectivity in 5G

mmWave cellular networks,” Proceedings - Mediterranean Ad Hoc Network Workshop

(Med-Hoc-Net), pp. 1–7, 2016.

[117] H. Shokri-Ghadikolaei, L. Gkatzikis, and C. Fischione, “Beam-searching and transmis-

sion scheduling in millimeter wave communications,” Proceedings - IEEE International

Conference on Communications (ICC), pp. 1292–1297, 2015.


[118] M. Hussain and N. Michelusi, “Energy-efficient interactive beam-alignment for millimeter-wave networks,” IEEE Transactions on Wireless Communications, vol. 18, no. 2, pp. 838–851, Feb. 2019.

[119] J. Palacios, D. De Donno, and J. Widmer, “Tracking mm-Wave channel dynamics: Fast

beam training strategies under mobility,” Proceedings - 36th Annual IEEE International

Conference on Computer Communications (INFOCOM), pp. 1–9, 2017.

[120] J. Liu and E. S. Bentley, “Hybrid-beamforming-based millimeter-wave cellular network

optimization,” Proceedings - 15th International Symposium on Modeling and Optimiza-

tion in Mobile, Ad Hoc, and Wireless Networks (WiOpt), pp. 1–8, 2017.

[121] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially sparse precod-

ing in millimeter wave MIMO systems,” IEEE Transactions on Wireless Communications,

vol. 13, no. 3, pp. 1499–1513, 2014.

[122] J. Wildman et al., “On the joint impact of beamwidth and orientation error on throughput in

directional wireless Poisson networks,” IEEE Transactions on Wireless Communications,

vol. 13, no. 12, pp. 7072–7085, 2014.

[123] T. Nitsche, C. Cordeiro, A. B. Flores, E. W. Knightly, E. Perahia, and J. C. Widmer, “IEEE 802.11ad: directional 60 GHz communication for multi-Gigabit-per-second Wi-Fi,” IEEE


[124] T. Baykas, C.-S. Sum, Z. Lan, J. Wang, M. A. Rahman, H. Harada, and S. Kato, “IEEE 802.15.3c: the first IEEE wireless standard for data rates over 1 Gb/s,” IEEE Communications Magazine, vol. 49, no. 7, 2011.

[125] J. D. Little and S. C. Graves, “Little’s law.” Springer, 2008, pp. 81–100.

[126] M. S. Elbamby, C. Perfecto, M. Bennis, and K. Doppler, “Toward low-latency and ultra-reliable virtual reality,” IEEE Network, vol. 32, no. 2, pp. 78–84, 2018.

[127] A. Mukherjee, “Queue-aware dynamic on/off switching of small cells in dense heteroge-

neous networks,” Proceedings - IEEE Global Communications Conference Workshops, pp.

182–187, Dec. 2013.

[128] S. M. Perlaza, H. Tembine, S. Lasaulce, and M. Debbah, “Quality-of-service provisioning

in decentralized networks: A satisfaction equilibrium approach,” IEEE Journal of Selected

Topics in Signal Processing, vol. 6, no. 2, pp. 104–116, 2012.

[129] S. Samarakoon, M. Bennis, W. Saad, and M. Latva-aho, “Backhaul-aware interference

management in the uplink of wireless small cell networks,” IEEE Transactions on Wireless


[130] S. Singh, T. Jaakkola, M. L. Littman, and C. Szepesvári, “Convergence results for single-step on-policy reinforcement-learning algorithms,” Machine Learning, vol. 38, no. 3, pp. 287–308, 2000.

[131] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge University Press, 2004.


[132] A. Ben-Tal and A. Nemirovski, Lectures on modern convex optimization: Analysis, algo-

rithms, and engineering applications. SIAM, 2001.

[133] K.-G. Nguyen, L.-N. Tran, O. Tervo, Q.-D. Vu, and M. Juntti, “Achieving energy efficiency

fairness in multicell MISO downlink,” IEEE Communications Letters, vol. 19, no. 8, pp.

1426–1429, 2015.

[134] A. Adhikary, E. Al Safadi, M. K. Samimi, R. Wang, G. Caire, T. S. Rappaport, and A. F.

Molisch, “Joint spatial division and multiplexing for mm-wave channels,” IEEE Journal

on Selected Areas in Communications, vol. 32, no. 6, pp. 1239–1255, 2014.

[135] T. L. Marzetta and B. M. Hochwald, “Fast transfer of channel state information in wireless

systems,” IEEE Transactions on Signal Processing, vol. 54, no. 4, pp. 1268–1278, 2006.

[136] T. Bai, V. Desai, and R. W. Heath, “Millimeter wave cellular channel models for system

evaluation,” Proceedings - IEEE International Conference on Computing, Networking and

Communications (ICNC), pp. 178–182, 2014.

[137] M. Weiner et al., “Design of a low-latency, high-reliability wireless communication sys-

tem for control applications,” Proceedings - IEEE International Conference on Communi-

cations (ICC), pp. 3829–3835, Jun. 2014.

[138] M. N. Kulkarni, E. Visotsky, and J. G. Andrews, “Correction factor for analysis of MIMO wireless networks with highly directional beamforming,” IEEE Wireless Communications Letters, vol. 7, no. 5, pp. 756–759, 2018.

[139] T. Bai, R. Vaze, and R. Heath, “Analysis of blockage effects on urban cellular networks,”

IEEE Transactions on Wireless Communications, vol. 13, no. 9, pp. 5070–5083, 2014.

[140] E. Björnson, E. G. Larsson, and T. L. Marzetta, “Massive MIMO: Ten myths and one

critical question,” IEEE Communications Magazine, vol. 54, no. 2, pp. 114–123, 2016.

[141] T. H. A. Le and D. T. Pham, “The DC (difference of convex functions) programming and

DCA revisited with DC models of real world nonconvex optimization problems,” Annals

of Operations Research, vol. 133, no. 1, pp. 23–46, 2005.

[142] T. Lipp and S. Boyd, “Variations and extension of the convex–concave procedure,” Opti-

mization and Engineering, pp. 1–25, 2014.

[143] O. Mihatsch and R. Neuneier, “Risk-sensitive reinforcement learning,” Machine Learning,

vol. 49, no. 2-3, pp. 267–290, 2002.

[144] G. Yang, M. Xiao, and H. V. Poor, “Low-latency millimeter-wave communications: Traffic

dispersion or network densification?” IEEE Transactions on Communications, vol. 66,

no. 8, pp. 3526–3539, 2018.


Appendix 1 Proofs in chapter 3

1.1 Convergence analysis for Algorithm 3.1

Next, we establish the convergence of Algorithm 3.1, which is based on the SCA method. The original problem (50) has a non-convex objective function (50a) and a non-convex constraint (50f), so the SCA method replaces it with the strongly convex problem (55). We only sketch the argument here for the sake of completeness, since it was studied in [85, 86]. Assume that Algorithm 3.1 obtains the solution of problem (55) at iteration $i+1$. The updating rule in Algorithm 3.1 ensures that the optimal values $\boldsymbol{\Lambda}^{o(i)}$, $\delta_{ks}^{(i)}$, and $\rho_s^{(i)}$ at iteration $i$ satisfy all constraints in (55) and are therefore feasible for the optimization problem at iteration $i+1$. Hence, since we minimize a convex function, the objective obtained at iteration $i+1$ is less than or equal to that at iteration $i$. In other words, Algorithm 3.1 yields a non-increasing sequence of objective values. Due to the antenna and interference constraints, the objective is bounded, and thus Algorithm 3.1 converges to a locally optimal solution of (55). Moreover, Algorithm 3.1 produces a sequence of points that are feasible for the original problem (50), and the resulting solution satisfies the KKT conditions of the original problem (50), as discussed in [85, 86].
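
The monotonicity argument can be illustrated on a toy problem. The following minimal sketch is not problem (50) or its surrogate (55): it assumes a hypothetical one-dimensional objective $f(x) = x^4 - 3x^2$ (a difference of two convex functions) and, at each iteration, replaces the concave part $-3x^2$ by its linearization around the current point, which gives a convex upper bound that is tight at that point. The names `f`, `sca_step`, and `run_sca` are illustrative only.

```python
import math


def f(x):
    """Toy non-convex objective (assumed): convex x**4 minus convex 3*x**2."""
    return x**4 - 3.0 * x**2


def sca_step(x_i):
    """Minimise the convex surrogate x**4 - [3*x_i**2 + 6*x_i*(x - x_i)].

    Setting its derivative 4*x**3 - 6*x_i to zero gives x = cbrt(1.5 * x_i).
    """
    v = 1.5 * x_i
    return math.copysign(abs(v) ** (1.0 / 3.0), v)


def run_sca(x0, max_iter=100, tol=1e-12):
    """Iterate surrogate minimisation until the objective stops decreasing."""
    xs = [x0]
    for _ in range(max_iter):
        xs.append(sca_step(xs[-1]))
        if abs(f(xs[-2]) - f(xs[-1])) < tol:
            break
    return xs


iterates = run_sca(0.5)
objective = [f(x) for x in iterates]
for i, (x, val) in enumerate(zip(iterates, objective)):
    print(f"iter {i:2d}  x = {x:.6f}  f(x) = {val:.6f}")
```

Because each surrogate upper-bounds `f` and touches it at the current iterate, the values in `objective` never increase, mirroring the non-increasing sequence produced by Algorithm 3.1. For this toy objective the iterates approach a stationary point of `f`.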

1.2 Performance analysis

Theorem 1.1 characterizes the performance of the network utility maximization under the Lyapunov framework; its proof also shows that the queues are stable.

Theorem 1.1. [Optimality] Assume that all queues are initially empty. For arbitrary arrival rates, the operation mode and load balancing are chosen to satisfy (49) and the rate regime. For a given constant $\chi \geq 0$, the network utility maximization with any $\nu > 0$ provides the following utility performance with a $\chi$-approximation:

\[ f_0 \geq f_0^{*} - \frac{\Psi + \chi}{\nu}, \]

where $f_0^{*}$ is the optimal network utility over the rate regime.

Proof: To prove Theorem 1.1, we first prove that the queues are bounded. Let $\pi_k$ denote the largest right derivative of $f(r_k)$. The Lyapunov framework guarantees the following strong stability of the virtual queues and the network queues:

\[ Q_k(t) \leq \nu \omega_k(t) \pi_k + 2 a_k^{\max}, \tag{116} \]

\[ Y_k(t) \leq \nu \omega_k(t) \pi_k + a_k^{\max}, \tag{117} \]

\[ D_s(t) \leq \nu \omega_s(t) \pi_s + a_{s+M}^{\max}. \tag{118} \]

We first prove the bounds on the virtual queues, (117) and (118), by induction; the bound on the network queues, (116), is then proved similarly. Since all queues are initially empty, the bounds clearly hold for $t = 0$. Suppose that these inequalities hold for some $t > 0$; we need to show that they also hold for $t + 1$.

From (45) and (47), if $Y_k(t) \leq \nu \omega_k(t) \pi_k$ and $D_s(t) \leq \nu \omega_s(t) \pi_s$, then $Y_k(t+1) \leq \nu \omega_k(t) \pi_k + a_k^{\max}$ and $D_s(t+1) \leq \nu \omega_s(t) \pi_s + a_{s+M}^{\max}$, so the bounds hold for $t + 1$ due to the arrival rate constraints $r_k(t) \leq a_k^{\max}$ and $r_s(t) \leq a_s^{\max}$. Otherwise, if $Y_k(t) \geq \nu \omega_k(t) \pi_k$ and $D_s(t) \geq \nu \omega_s(t) \pi_s$, then every queue weight is at least the largest marginal utility; since the auxiliary variables are chosen to minimize $\sum_{k=1}^{K} Y_k(t) \phi_k(t) + \sum_{s=1}^{S} D_s(t) \phi_{s+M}(t) - \nu f_0(\boldsymbol{\phi}(t))$, $\boldsymbol{\phi}(t)$ is forced to be zero. From (47) and (45), $Y_k(t+1)$ and $D_s(t+1)$ are then bounded by $Y_k(t)$ and $D_s(t)$, respectively. Since the virtual queues are bounded at time $t$, we have the following inequalities:

\[ Y_k(t+1) \leq Y_k(t) \leq \nu \omega_k(t) \pi_k + a_k^{\max}, \tag{119} \]

\[ D_s(t+1) \leq D_s(t) \leq \nu \omega_s(t) \pi_s + a_{s+M}^{\max}. \tag{120} \]

Hence, the bounds on the virtual queues hold for all $t$. Similarly, we show that the network queue bound (116) holds for all $t$. It clearly holds for $t = 0$. Assume that (116) holds for some $t > 0$; we now prove that it holds for $t + 1$. From (44) and (47), we have $Q_k(t+1) \leq Y_k(t+1) + a_k(t)$. Moreover, we have just proved that $Y_k(t+1) \leq \nu \omega_k(t) \pi_k + a_k^{\max}$; hence $Q_k(t+1) \leq \nu \omega_k(t) \pi_k + 2 a_k^{\max}$, and the network queue bound holds for $t + 1$.
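
The induction behind the virtual-queue bound (117) can be checked numerically on a single-user toy model. The sketch below is an illustration under stated assumptions, not the thesis setup: it assumes the utility $f(\phi) = \log(1+\phi)$ with weight $\omega = 1$ (so the largest right derivative is $\pi = f'(0) = 1$), an auxiliary variable $\phi \in [0, a^{\max}]$ picked to minimize $Y\phi - \nu f(\phi)$, a virtual-queue update of the form $Y(t+1) = \max\{Y(t) + \phi(t) - r(t), 0\}$ standing in for (45) and (47), and a hypothetical random served rate $r(t)$. The names `choose_aux` and `simulate` are illustrative.

```python
import random


def choose_aux(Y, nu, a_max):
    """Minimise Y*phi - nu*log(1 + phi) over 0 <= phi <= a_max (assumed model)."""
    if Y * (1.0 + a_max) <= nu:
        # The unconstrained minimiser nu/Y - 1 is at or above a_max.
        return a_max
    # Stationary point, clipped at zero; it is zero once Y >= nu * pi (pi = 1).
    return max(nu / Y - 1.0, 0.0)


def simulate(nu, a_max=2.0, slots=5000, seed=0):
    """Run the toy virtual queue and return the largest backlog observed."""
    rng = random.Random(seed)
    Y, peak = 0.0, 0.0
    for _ in range(slots):
        phi = choose_aux(Y, nu, a_max)
        r = rng.uniform(0.0, 0.5 * a_max)  # hypothetical served rate
        Y = max(Y + phi - r, 0.0)          # assumed virtual-queue update
        peak = max(peak, Y)
    return peak


for nu in (1.0, 10.0, 100.0):
    bound = nu * 1.0 + 2.0  # nu * omega * pi + a_max with omega = pi = 1
    print(f"nu = {nu:6.1f}  peak Y = {simulate(nu):8.3f}  bound = {bound:8.3f}")
```

In this toy model the peak backlog stays below $\nu\omega\pi + a^{\max}$ for every tested $\nu$, and it grows with $\nu$, which is the queue-length side of the utility-delay tradeoff.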

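Having established the queue bounds, we now show the utility bound. Since the solution of (46) minimizes the Lyapunov drift and the objective function in every time slot $t$, we have the following inequality:

\[
\begin{aligned}
\Delta(\boldsymbol{\Xi}(t)) - \nu \mathbb{E}[f_0(\boldsymbol{\phi}(t))] \leq{} & \Psi - \nu \mathbb{E}[f_0(\boldsymbol{\phi}^{*}(t))] + \sum_{k=1}^{K} Q_k(t)\, \mathbb{E}\big[a_k(t) - r_k^{*}(t) \,\big|\, \boldsymbol{\Xi}(t)\big] \\
& + \sum_{k=1}^{K} Y_k(t)\, \mathbb{E}\big[\phi_k^{*}(t) - r_k^{*}(t) \,\big|\, \boldsymbol{\Xi}(t)\big] \\
& + \sum_{s=1}^{S} D_s(t)\, \mathbb{E}\big[\phi_{s+M}^{*}(t) - \varphi^{(bs)*}(t)\, r_s^{cs*}(t) \,\big|\, \boldsymbol{\Xi}(t)\big],
\end{aligned}
\]

where $\boldsymbol{\phi}^{*}(t)$, $\varphi^{(bs)*}(t)$, and $r_k^{*}(t)$ are the optimal values of problem (49). Since the queues are bounded, for a given $\chi \geq 0$ we obtain

\[ \Delta(\boldsymbol{\Xi}(t)) - \nu \mathbb{E}[f_0(\boldsymbol{\phi}(t))] \leq \Psi - \nu \mathbb{E}[f_0(\boldsymbol{\phi}^{*}(t))] + \chi . \]

Taking expectations of both sides of the above inequality and choosing $\mathbf{r}^{*}(t) = \boldsymbol{\phi}^{*}(t)$ yields, for all $t \geq 0$,

\[ \mathbb{E}\big[L(\boldsymbol{\Xi}(t+1)) - L(\boldsymbol{\Xi}(t)) \,\big|\, \boldsymbol{\Xi}(t)\big] - \nu \mathbb{E}[f_0(\boldsymbol{\phi}(t))] \leq \Psi + \chi - \nu \mathbb{E}[f_0(\mathbf{r}^{*}(t))]. \]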

Summing the above inequality over $\tau = 0, \ldots, t-1$, dividing by $t$, and using the fact that $f_0(\mathbf{r}^{*}(\tau)) = f_0^{*}$ yields

\[ \frac{\mathbb{E}[L(\boldsymbol{\Xi}(t))] - \mathbb{E}[L(\boldsymbol{\Xi}(0))]}{t} - \frac{\nu}{t} \sum_{\tau=0}^{t-1} \mathbb{E}[f_0(\boldsymbol{\phi}(\tau))] \leq \Psi + \chi - \nu f_0^{*}. \tag{121} \]

Using the facts that $L(\boldsymbol{\Xi}(t)) \geq 0$ and $L(\boldsymbol{\Xi}(0)) = 0$, applying Jensen's inequality to the concave function $f_0$, and rearranging terms yields

\[ f_0(\boldsymbol{\phi}(t)) \geq f_0^{*} - \frac{\Psi + \chi}{\nu}. \]

Since the network utility function is non-decreasing and concave, and the auxiliary variable is chosen to satisfy $r_k(t) \geq \phi_k(t)$, we have $f_0(\mathbf{r}(t)) \geq f_0(\boldsymbol{\phi}(t)) \geq f_0^{*} - \frac{\Psi + \chi}{\nu}$, which means that the solution approaches the optimum as $\nu$ increases. This completes the proof of Theorem 1.1. Together with the queue bounds (116)-(118), which grow linearly in $\nu$, this gives an $[O(1/\nu), O(\nu)]$ utility-queue length tradeoff, which leads to a utility-delay balance.

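We now prove that all queues are stable in the sense of Definition 2.2. The bound (121) can be rewritten as

\[ \Delta(\boldsymbol{\Xi}(t)) \leq C, \]

where $C$ is any constant that satisfies $C \geq \Psi + \chi - \nu\big(f_0^{*} - \mathbb{E}[f_0(\boldsymbol{\phi}(t))]\big)$ for all $t$ and $\boldsymbol{\Xi}(t)$. Using the definition of the Lyapunov drift and taking expectations, we obtain

\[ \mathbb{E}[L(\boldsymbol{\Xi}(t))] \leq Ct. \]

From the definition of the Lyapunov function $L(\boldsymbol{\Xi}(t))$, we have

\[ \mathbb{E}[Q_k(t)]^2,\ \mathbb{E}[Y_k(t)]^2,\ \mathbb{E}[D_s(t)]^2 \leq 2Ct. \]

Dividing both sides by $t^2$ and taking square roots shows that, for all $t > 0$,

\[ \frac{\mathbb{E}[Q_k(t)]}{t},\ \frac{\mathbb{E}[Y_k(t)]}{t},\ \frac{\mathbb{E}[D_s(t)]}{t} \leq \sqrt{\frac{2C}{t}}. \]

Taking the limit as $t \to \infty$, the right-hand side vanishes, which proves that the queues are stable.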
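
The last steps of this stability argument can be spelled out as follows. This is a sketch that assumes the standard quadratic Lyapunov function, which is not restated in this appendix, and it shows the argument for $Q_k$ only; the other queues follow in the same way.

```latex
% Assumption (not restated in this appendix): quadratic Lyapunov function
%   L(\Xi(t)) = (1/2) [ \sum_k Q_k(t)^2 + \sum_k Y_k(t)^2 + \sum_s D_s(t)^2 ]
\begin{align*}
\mathbb{E}[L(\boldsymbol{\Xi}(t))] - \mathbb{E}[L(\boldsymbol{\Xi}(0))]
  &= \sum_{\tau=0}^{t-1} \mathbb{E}[\Delta(\boldsymbol{\Xi}(\tau))] \le Ct
  && \text{(telescoping sum of drifts)} \\
\tfrac{1}{2}\, \mathbb{E}[Q_k(t)^2]
  &\le \mathbb{E}[L(\boldsymbol{\Xi}(t))] \le Ct
  && \text{(since } L(\boldsymbol{\Xi}(0)) = 0 \text{)} \\
\big(\mathbb{E}[Q_k(t)]\big)^2
  &\le \mathbb{E}[Q_k(t)^2] \le 2Ct
  && \text{(Jensen's inequality)} \\
\frac{\mathbb{E}[Q_k(t)]}{t}
  &\le \sqrt{\frac{2C}{t}} \xrightarrow[t \to \infty]{} 0 .
\end{align*}
```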
