UNIVERSITY OF OULU P.O. Box 8000 FI-90014 UNIVERSITY OF OULU FINLAND
ACTA UNIVERSITATIS OULUENSIS
University Lecturer Tuomo Glumoff
University Lecturer Santeri Palviainen
Senior research fellow Jari Juuti
Professor Olli Vuolteenaho
University Lecturer Veli-Matti Ulvinen
Planning Director Pertti Tikkanen
Professor Jari Juga
University Lecturer Anu Soikkeli
Professor Olli Vuolteenaho
Publications Editor Kirsti Nurkkala
ISBN 978-952-62-2242-4 (Paperback), ISBN 978-952-62-2243-1 (PDF), ISSN 0355-3213 (Print), ISSN 1796-2226 (Online)
OULU 2019
C 703
Kien Vu
INTEGRATED ACCESS-BACKHAUL FOR 5G WIRELESS NETWORKS
UNIVERSITY OF OULU GRADUATE SCHOOL; UNIVERSITY OF OULU, FACULTY OF INFORMATION TECHNOLOGY AND ELECTRICAL ENGINEERING; CENTRE FOR WIRELESS COMMUNICATIONS
ACTA UNIVERSITATIS OULUENSIS C Technica 703
KIEN VU
INTEGRATED ACCESS-BACKHAUL FOR 5G WIRELESS NETWORKS
Academic dissertation to be presented with the assent of the Doctoral Training Committee of Information Technology and Electrical Engineering of the University of Oulu for public defence in the OP auditorium (L10), Linnanmaa, on 13 May 2019, at 12 noon
UNIVERSITY OF OULU, OULU 2019
Copyright © 2019
Acta Univ. Oul. C 703, 2019
Supervised by
Professor Matti Latva-aho
Associate Professor Mehdi Bennis
Reviewed by
Professor Petar Popovski
Associate Professor Ming Xiao
Opponent
Professor Risto Wichman
ISBN 978-952-62-2242-4 (Paperback)
ISBN 978-952-62-2243-1 (PDF)
ISSN 0355-3213 (Printed)
ISSN 1796-2226 (Online)
Cover Design
Raimo Ahonen
JUVENES PRINT
TAMPERE 2019
Vu, Kien, Integrated access-backhaul for 5G wireless networks.
University of Oulu Graduate School; University of Oulu, Faculty of Information Technology and Electrical Engineering; Centre for Wireless Communications
Acta Univ. Oul. C 703, 2019
University of Oulu, P.O. Box 8000, FI-90014 University of Oulu, Finland
Abstract
With the unprecedented growth in mobile data traffic and network densification, the emerging fifth-generation (5G) wireless network warrants a paradigm shift with respect to system design and technological enablers. In this regard, the prime motivation of this thesis is to propose an integrated access-backhaul (IAB) framework to dynamically schedule users, while efficiently providing a wireless backhaul to dense small cells and mitigating interference. In addition, joint resource allocation and interference mitigation solutions are proposed for two-hop and multi-hop self-backhauled millimeter wave (mmWave) networks.
The first contribution of this thesis focuses on a multi-user two-hop relay cellular system in which a massive antenna array enabled macro base station (BS) simultaneously provides high beamforming gains to outdoor users, and wireless backhauling to outdoor small cells. Moreover, a hierarchical interference mitigation scheme is applied to efficiently mitigate cross-tier and co-tier interference.
In the second contribution, a multi-hop self-backhauled mmWave communication scenario is studied whereby a joint multi-hop multi-path selection and rate allocation framework is proposed to enable Gbps data rates with reliable communications. Using reinforcement learning techniques, a dynamic and efficient re-routing solution is proposed to cope with blockage and latency constraints. Finally, a risk-sensitive learning solution is leveraged to provide high-reliability and low-latency communications.
In summary, the dissertation analyses key trade-offs between (i) capacity and latency, and (ii) reliability and network density. Extensive simulations were carried out to verify the performance gains of the proposed algorithms compared to several baselines and for different network settings. Key findings show significant improvements in terms of higher data rates, lower latency, and reliable communications with some trade-offs.
Keywords: 5G, integrated access and backhaul, latency, massive MIMO, mmWave communications, reliability, ultra-dense networks
Vu, Kien, Integroitu liityntä- ja runkoverkkoyhteys langattomiin 5G-verkkoihin [Integrated access and backhaul connectivity for 5G wireless networks].
University of Oulu Graduate School; University of Oulu, Faculty of Information Technology and Electrical Engineering; Centre for Wireless Communications
Acta Univ. Oul. C 703, 2019
University of Oulu, P.O. Box 8000, FI-90014 University of Oulu, Finland
Tiivistelmä
As a result of the unprecedented growth in mobile data traffic and the densification of networks, the system design and the use of technological enablers for the soon-to-be-deployed fifth-generation (5G) wireless networks have had to be approached from an entirely new perspective. Accordingly, the guiding idea of this dissertation is to propose an integrated access and backhaul model in which users are scheduled dynamically while efficient backhaul connections are established for small cells. To this end, joint solutions for resource allocation and interference mitigation are studied that support two-hop or multi-hop connections and simultaneous backhaul establishment in millimeter wave networks.
The first part of the work focuses on a multi-user, relay-assisted two-hop cellular network in which a large antenna array at the macro base station simultaneously forms high-gain antenna beams for the user links and for the wireless backhaul segment. In addition, a hierarchical interference mitigation method is applied to efficiently reduce co-tier and cross-tier interference.
The next part of the work addresses the research problem of multi-hop backhaul establishment in millimeter wave communication by developing a joint method for multi-hop multi-path selection and the allocation of transmission resources. The aim is gigabit-class data rates and reliable communication in the millimeter wave band. Using reinforcement learning techniques, a dynamic and efficient re-routing concept is presented that operates under blockage and delay constraints. Finally, risk-sensitive learning and antenna diversity techniques are exploited to achieve high reliability and low latency in millimeter wave transmission.
With these, trade-offs are analysed between, for example, (i) capacity and latency and (ii) reliability and network density/load. Extensive simulations demonstrate the performance gains of the proposed algorithms relative to known baselines in several different scenarios. According to the results, significant cost savings are achieved in infrastructure and backhaul, along with high data rates and improvements in low-latency reliable communication.
Keywords: integrated access and backhaul, beamforming design, massive MIMO, millimeter wave communications, ultra-dense small cell network
Dedicated to my friends
and to my family
Preface
This work was carried out at the Centre for Wireless Communications (CWC) and the
Faculty of Information Technology and Electrical Engineering (ITEE) at the University
of Oulu, in Finland from November 2014. However, this work would not have been
possible without the encouragement, help, and guidance that I received over the years
from many individuals.
First, I would like to express my sincerest gratitude to my supervisors, Professor
Matti Latva-aho and Associate Professor Mehdi Bennis, for providing the opportunity
to pursue my doctoral studies. I greatly appreciate their vast knowledge and inspiring
ideas, which cover a multitude of areas as well as their continual support, guidance,
and encouragement throughout my postgraduate research and studies. Furthermore, I
would like to thank Professor Mérouane Debbah from Huawei R&D in France for his
valuable support and comments on my work. It was an honour to carry out my doctoral
research under his guidance.
I would also like to thank my follow-up group, Professor Markku Juntti and Adjunct
Professor Pekka Pirinen from the University of Oulu, for their insightful advice
and discussions during my doctoral studies. I further wish to express my gratitude to
the pre-examiners of this thesis, Professor Petar Popovski from Aalborg University,
Denmark, and Associate Professor Ming Xiao from KTH Royal Institute of Technology,
Sweden, for their constructive comments, and to Professor Risto Wichman from
Aalto University, Finland, for acting as the prestigious opponent in my doctoral
defence. I would also like to thank Dr. Le-Nam Tran from University College Dublin and
Dr. Vinh Phan from Nokia Bell Labs for their invaluable advice and support during
my doctoral studies. Furthermore, I would like to thank Professor ZhiSheng Niu for
hosting my research visit at Tsinghua University, and Chen Sheng for his company
and support. Importantly, I would like to thank all the editors and anonymous
reviewers for their constructive comments on the work.
This thesis was financially supported in part by the Academy of Finland 6Genesis
Flagship project (grant 318927). While conducting the research, I was able to work
on several projects. The work has been supported by the Academy of Finland under
the project 5Gto10G, the Software Defined Hyper Cellular Architecture project for
Green and Smart Service Provisioning in 5G Networks (HYPER5G), the Higher Frequency
5G Communications (High5) project, the Academy of Finland funding via grant
307492 and the CARMA grants 294128 and 289611, and the project SMARTER. I have
also been fortunate to receive personal grants from the Nokia Foundation, the Finnish
Foundation for Technology Promotion, the Riitta and Jorma J. Takanen Foundation, the
UNIOGS travel grant, and the Tauno Tönning Foundation. All of these funders are
highly appreciated.
Alone we can do so little; but together we can do so much. I would like to thank
our research team, Sumudu, Chen-Feng, Elbamby, Cristina, Petri, Mohamed, Jihong,
Hamza, Hamid, Anis, and Mounssif, for the countless productive discussions and
meetings. My special thanks also go to Sumudu for his great help all the time. He was
always available for discussions and made good suggestions on research and other
practical matters. Chen-Feng, Cristina, and Elbamby were also very supportive in any
practical issues at work, and we also had nice after-work conversations. The countless
friendly discussions on professional as well as personal matters with them allowed me
to stay focused and high-spirited, for which I am truly grateful. Life without coffee
is impossible, and I would like to thank Elbamby for his company and unfailing support.
I would also like to thank Giang, Satya, and Doanh, who provided a lot of valuable
advice regarding my research problem. I would like to thank Ayotunde and Manosa for
all the moments we have shared in the office TS 414. Further, I would like to thank all
my CWC colleagues for maintaining an inspiring and supportive working environment.
They are too many to name, but include Parisa, Iran, Jiquang, Qiang, Moiz, Samad,
Inosha, Makus, Oskari, Nuutti, Jari Marjakangas, Hamidreza, Tachporn, Mojtaba, and
many others. I also would like to thank the administrative staff from CWC and UoU,
including Kirsi Ojutkangas, Jari Sillanpää, Eija Pajunen, and Anu Niskanen, for their
unfailing support and assistance.
I am also thankful to the Vietnamese community: Thang, Khanh & Ha, Nhat, Kien
Ngo, Lan, Ha, Vu, Minh Thuy, Thao Pham, Thao Duong, Hoang, Tam, Dung, Linh, Dat,
Phong, Tri, Lam & Linh, Hong, and the many others who are or were here during the
past couple of years, for their friendship and the memorable moments. Life in
Oulu would not have been as wonderful without them. My special thanks also go to
JP and Mai, and Tai for their friendship and for making our gatherings so memorable. I
would like to thank the family of Sxu, Phuong, and Tara, and Mr. Xuan Bao for their
understanding and friendship, and for helping me stay sane through difficult times
since the day I came to Finland. Without their advice and support, my life probably
would have proceeded down the wrong path, and I am so grateful for all that Phuong
and Sxu have done. I also would like to thank the family of Jussi and Miia for their
friendship and encouragement. Special thanks also go to my friends far away in
Vietnam: Anh Tuan, Hai, Hoang Anh, Nam, Chung, Nguyen Tuan, Binh, Trang, Thuc,
Hiep, Thang, Tran Dung; in South Korea: Khanh, Duc, Quoc Hoan, Ngoc Hoan,
Anh Tuan, Minh Luan; and many others.
Last and definitely not least, I would not be standing here without the endless love,
support, and inspiration from all my relatives, Binh Minh, Huu Khanh, Phuong Le, Mai
Huong, and Khanh Huong, and my big family. I would like to thank my parents, my sister,
my nephew, and my girlfriend for their love and care, and for being the ultimate
source of courage and strength in my life.
Oulu, November 2018.
Kien Vu
List of abbreviations
Acronyms:
5G 5th generation
BG Boltzmann-Gibbs
BS Base station
CA Closed access
CCDF Complementary cumulative distribution function
CCP Convex-concave procedure
CDF Cumulative distribution function
CSI Channel state information
DC Difference of convex function
DL Downlink
DPP Drift-plus-penalty
FD Full-duplex
HA Hybrid access
HD Half-duplex
HetNet Heterogeneous network
HomNet Homogeneous network
INR Interference-to-noise ratio
ISD Inter-site distance
KKT Karush-Kuhn-Tucker
LOS Line-of-sight
MBS Macro cell base station
MIMO Multiple-input multiple-output
MISO Multiple-input single-output
mmWave millimeter wave
MUE Macro cell user equipment
NLOS Non line-of-sight
NUM Network utility maximization
OA Open access
PF Proportional fair
QoS Quality-of-service
QSI Queue state information
RAN Radio access network
RHS Right-hand side
RL Reinforcement learning
RMT Random matrix theory
RSL Risk-sensitive reinforcement learning
RZF Regularized zero-forcing
SC Small cell base station
SCA Successive convex approximation
SIC Self-interference cancellation
SINR Signal to interference and noise ratio
SNR Signal to noise ratio
SOCP Second-order cone programming
SUE Small cell user equipment
TDD Time division duplexing
TNU Total network utility
UDN Ultra-dense network
UE User equipment
UL Uplink
URC Ultra-reliable communication
URLLC Ultra-reliable low latency communication
UT User throughput
WSRM Weighted sum rate maximization
ZF Zero-forcing
Roman-letter notations:
$a_m$ Data arrival destined for UE $m$
$\bar{a}_m$ Mean arrival rate at UE $m$
$a_m^{\max}$ Maximum data arrival destined for UE $m$
$B$ Number of all base stations
$\mathbf{H}^{(b)}$ Channel matrix between all UEs and the BS $b$ in chapter 3
$\mathbf{h}_m^{(b)}$ Channel vector between the $m$th UE and the BS $b$ in chapter 3
$h_m^{(b,n)}$ Channel between the $m$th MUE and the $n$th antenna of BS $b$ in chapter 3
$\hat{\mathbf{H}}^{(b),M}$ Estimate of channel matrix $\mathbf{H}^{(b),M}$ in chapter 3
$\mathbf{h}_u^{(b_s)}$ Channel vector between the $u$th UE and the SC $b_s$ in chapter 3
$\mathbf{H}_{(i,j)}$ Channel matrix between transmitter $i$ and receiver $j$ in chapter 4
$\hat{\mathbf{H}}_{(i,j)}$ Estimate of channel matrix between transmitter $i$ and receiver $j$ in chapter 4
$\mathbf{H}_m$ Channel matrix between the MBS and a UE $m$ in chapter 5
$\hat{\mathbf{H}}_m$ Estimate of channel matrix between the MBS and a UE $m$ in chapter 5
$\mathbf{H}_{bk}$ Channel matrix between BS $b$ and UE $k$ in chapter 6
$\hat{\mathbf{H}}_{bk}$ Estimate of channel matrix between BS $b$ and UE $k$ in chapter 6
$K$ Number of user equipments
$\mathbf{l}$ Load balancing variable vector
$l_{c_s}^{(b_s)}$ Transmission association indicator from SC $b_s$ to SUE $c_s$
$l_m^{(b_s)}$ Transmission association indicator from BS $b_s \in \mathcal{B}$ to UE $m$
$l_{s+M}^{(b_0)}$ Transmission association indicator from MBS $b_0$ to SC $s$
$M$ Number of macro user equipments
$N$ Number of antennas at the MBS
$N_s$ Number of antennas at the SC $s$
$N_s^{\mathrm{au}}$ Number of active users at SC $s$
$N_s^{\mathrm{tx}}$ Total number of transmissions at SC
$\mathbf{P}$ Transmit power matrix
$\mathbf{p}$ Transmit power allocation vector
$P^{(b_0)}$ or $P$ Maximum transmit power at the MBS
$p_m^{(b_0)}$ DL MBS transmit power assigned to UE $m$
$p_{s+M}^{(b_0)}$ DL MBS transmit power assigned to SC $s$
$\mathbf{Q}$ Network queue at the MBS
$Q_m$ Network queue backlog for UE $m$
$\mathbf{r}$ Ergodic data rate vector of all UEs
$\bar{\mathbf{r}}$ Time average expectation of the ergodic data rate vector of all UEs
$r_m^{(b_0)}$ Ergodic data rate at the $m$th UE from the BS $b$
$S$ Number of small cell base stations
$\mathbf{T}$ Co-tier interference mitigation precoding matrix at the MBS
$\mathbf{U}$ Cross-tier interference mitigation precoding matrix at the MBS
$\mathbf{V}$ Precoding matrix at the MBS
$\mathbf{v}_m$ Precoding vector of UE $m$
$\mathbf{w}_m^{(b_0)}$ Real small-scale fading channel matrix
$\hat{\mathbf{w}}_m^{(b_0)}$ Estimate of the small-scale fading matrix
$x_m^{(b)}$ Signal symbol at the $m$th MUE from the BS $b$
$\mathbf{Y}$ Virtual queue vector for auxiliary variables
$y_m^{(b_0)}$ Received signal at the $m$th UE from the BS $b$
$h_u^{(b_s)}$ Received signal at the $u$th UE from the SC $b_s$
$\mathbf{z}_m^{(b_0)}$ Small-scale fading channel noise matrix at UE $m$
Mathcal-style notations:
$\mathcal{B}$ Set of all base stations
$\mathcal{F}$ Set of data flows
$\mathcal{K}$ Number of single-antenna UEs
$\mathcal{L}$ Set of all directional edges
$\mathcal{M}$ Number of macro UEs
$\mathcal{N}$ Set of all nodes
$\mathcal{N}_i^{(o)}$ Set of all nodes
$\mathcal{N}_i^{(o)}$ Set of the next hops from node $i$
$\mathcal{R}$ Average rate region
$\mathcal{S}$ Set of small cell base stations
$\mathcal{Z}_f$ Set of disjoint paths observed by flow $f$
Greek-letter notations:
$\alpha$ Network control action
$\beta$ Network random event
$\chi$ Approximation factor
$\delta$ Slack variable for SCA method in chapter 3
$\varepsilon$ Reliability target
$\eta_m$ Thermal noise at UE $m$
$\iota$ Learning rate
$\kappa$ Learning temperature rate
$\mu$ Risk-sensitive factor
$\nu$ Lyapunov control parameter
$\omega_m$ Weight of user $m$
$\varphi$ Operation mode to control the FD-enabled SC transmission
$\pi$ Probability of choosing an action
$\Phi$ Regret rate
$\boldsymbol{\Phi}$ Regret rate vector
$\sigma^2$ Thermal noise covariance
$\tau_m$ Channel estimate error of UE $m$
$\boldsymbol{\Theta}$ Spatial channel correlation matrix
$\theta$ Beamwidth
$\phi$ Auxiliary variable
$\zeta$ Regularized zero forcing parameter
$\Delta(\cdot)$ Lyapunov drift function
$\boldsymbol{\Lambda}$ Composite control variable
$\boldsymbol{\Lambda}_o$ Composite control variable of load balancing and operation mode
$\boldsymbol{\Omega}$ Solution of the Stieltjes transformation
$\boldsymbol{\Psi}$ Lyapunov independent constant
$\boldsymbol{\Xi}$ Queue backlog vectors
$\aleph$ Flow utility in chapter 4
$k_0$ Allowed FD INR threshold
$k_m^{(b_s)}$ FD INR from FD-enabled SC $b_s$ to UE $m$
Mathematical operator notations and symbols:
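$|\mathcal{X}|$ Cardinality of the set $\mathcal{X}$
$\mathrm{diag}(\mathbf{x})$ Diagonal matrix with $\mathbf{x}$ as the diagonal
$\|\mathbf{x}\|$ Euclidean norm of vector $\mathbf{x}$
$\mathbb{E}[\cdot]$ Expectation function
$\mathbf{1}_x(\mathcal{X})$ Indicator function, i.e. returns 1 if $x \in \mathcal{X}$, 0 otherwise
$\Pr(\cdot)$ Probability of the event
$x^+$ Returns $x$ if $x > 0$ and 0 otherwise
$\mathbb{Z}$ Set of integers
$\mathbb{R}$ Set of real numbers
$(\cdot)^\star$ Solution of an optimization problem
$\mathbf{X}^T$ Transpose of matrix $\mathbf{X}$
$\mathbf{X}^\dagger$ Hermitian of matrix $\mathbf{X}$
$\mathrm{Rank}(\mathbf{X})$ Rank of matrix $\mathbf{X}$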
Contents
Abstract
Tiivistelmä
Preface
List of abbreviations
Contents
1 Introduction
1.1 Motivation
1.2 5G technologies for mobile broadband
1.2.1 Ultra-dense small cell networks
1.2.2 Massive MIMO
1.2.3 Millimeter wave communications
1.3 Scope of the thesis
1.4 Author’s contribution
2 Research methodologies
2.1 Stochastic optimization for wireless networks
2.1.1 Queuing networks
2.1.2 Auxiliary variables and virtual queues introduction
2.1.3 Lyapunov optimization
2.2 A successive convex approximation technique
2.3 Random matrix theory
2.4 Reinforcement learning
3 Integrated access and backhaul architecture
3.1 Main contributions and related work
3.2 System model
3.3 Load balancing and interference mitigation
3.3.1 Downlink transmission
3.3.2 Joint load balancing and interference mitigation
3.3.3 Closed-form expression via a deterministic equivalent
3.4 Proposed load balancing and interference mitigation
3.4.1 Joint load balancing and operation mode selection
3.4.2 Auxiliary variable optimization
3.4.3 Interference mitigation and power allocation
3.4.4 Queue update
3.5 Numerical results
3.5.1 Simulation environments
3.5.2 Ultra-dense small cells environment
3.5.3 Wireless backhaul impact for different transmit power levels
3.5.4 Convergence
3.6 Summary and discussion
4 Self-backhauled multi-hop architecture
4.1 Main contributions and related work
4.2 System model
4.2.1 Network model
4.2.2 mmWave MIMO channel model
4.2.3 Transmission rate
4.2.4 Network queues
4.3 Problem formulation
4.4 Proposed path selection and rate allocation algorithm
4.4.1 Path selection
4.4.2 Rate allocation
4.5 Numerical results
4.5.1 Small antenna array system
4.5.2 Large antenna array system
4.5.3 Convergence characteristics
4.6 Summary and discussion
5 Low-latency communication in massive MIMO wireless networks
5.1 Main contributions and related work
5.2 System model
5.3 Problem formulation
5.4 Proposed control parameter selection and power allocation
5.4.1 Control parameters selection
5.4.2 Power allocation
5.5 Numerical results
5.5.1 Impact of the arrival rate
5.5.2 Impact of user density
5.6 Summary and discussion
6 Ultra-reliable communication in 5G mmWave networks
6.1 Main contributions and related work
6.2 System model
6.3 Problem formulation
6.4 Proposed distributed learning algorithm
6.5 Numerical results
6.6 Summary and discussion
7 Conclusions and future work
7.1 Conclusions
7.2 Future work
References
Appendices
1 Introduction
1.1 Motivation
The rapid growth in data traffic, driven by the massive number of connected
wireless devices (e.g., mobile phones, laptops, sensing devices) and rich content
applications (e.g., video and game streaming, augmented and virtual reality), is posing
unprecedented challenges in terms of extreme data rates, low latency, high reliability,
and scalability. The fifth generation (5G) wireless systems are expected to meet these
challenges, which require a paradigm shift in system design and radio technologies.
According to the International Telecommunication Union (ITU), 5G encompasses three
service categories: enhanced mobile broadband (eMBB), ultra-reliable and low latency
communication (URLLC), and massive machine-type communication (mMTC) [1, 2]. In
particular, eMBB aims at providing users with high peak data rates, and moderate rates
for cell-edge users; URLLC supports low-latency transmissions with very high
reliability [3, 4, 5, 6]; and mMTC supports a massive number of IoT devices [7, 8]. In
this regard, both academia and industry have paid tremendous attention to the
underutilised mmWave frequency bands (30–300 GHz) due to the current scarcity of the
wireless spectrum [9, 10, 11, 12, 13]. These traffic demands can be met by (i)
advanced spectrally-efficient techniques, e.g., massive multiple-input multiple-output
(MIMO) [14, 15, 16]; and (ii) ultra-dense self-backhauled small cell (SC)
deployments [17, 18, 19, 20, 21, 22]. Indeed, massive MIMO is instrumental in leveraging
mmWave frequency bands and providing wireless backhaul and access in ultra-dense
network deployments. Furthermore, network densification is a promising technique
to boost capacity and extend coverage by reducing the communication range between
users and base stations.
Fig. 1. Integrated access and backhaul architecture for the considered 5G network scenario
([23] ©2017 IEEE). [Figure: a massive MIMO MBS provides wireless backhaul to FD-SCs and
data access to MUEs; each MUE is served by either the MBS or a nearby FD-SC, while each
SUE is served by an SC only and is interfered by the MBS.]
This thesis examines three 5G enablers, namely mmWave communications, massive
MIMO, and ultra-dense small cells, with the goal of designing and optimizing an
integrated access and backhaul deployment. A potential use case is augmented and
virtual reality, which requires extreme data rates and very low latency. In
general, end-to-end latency can be defined as the time taken for a packet to travel
from the protocol layer at the source in which it is generated, through the network,
to the same layer at the destination. It includes the over-the-air transmission delay,
propagation delay, processing/computing delay, retransmission delay, and queuing
delay [5, 6]. Reliability can be defined as the probability that a packet is
successfully received at the destination within a given deadline. This thesis focuses
on the downlink (DL) transmission and the queuing delay, and addresses the following
fundamental questions:
– Q1: How can ultra-dense SCs be deployed to serve a large density of UEs in a
multi-user two-hop relay IAB scenario as shown in Fig. 1?
– Q2: How should paths be selected and transmission rates be allocated in multi-hop
multi-path self-backhauled mmWave networks as shown in Fig. 2?
– Q3: How can low latency communication be enabled for outdoor UEs with eMBB
services in massive MIMO wireless systems?
– Q4: How can ultra-reliable communication be provided in ultra-dense SC networks
in the presence of risk and uncertainty?
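The reliability definition used above (the probability that a packet is received within a given deadline) can be illustrated with a small empirical estimate. The following is only a minimal sketch: the function name, the delay samples, and the 1 ms deadline are illustrative assumptions and are not taken from the thesis.

```python
from typing import Sequence


def empirical_reliability(delays_ms: Sequence[float], deadline_ms: float) -> float:
    """Fraction of packets whose end-to-end delay does not exceed the deadline.

    A packet that never arrives can be encoded as float('inf').
    """
    if not delays_ms:
        raise ValueError("need at least one delay sample")
    on_time = sum(1 for d in delays_ms if d <= deadline_ms)
    return on_time / len(delays_ms)


# Hypothetical end-to-end delays in milliseconds; inf marks a lost packet.
samples = [0.8, 1.2, 0.5, float("inf"), 2.5, 0.9, 1.0, 0.7, 3.1, 0.6]
reliability = empirical_reliability(samples, deadline_ms=1.0)
print(f"Estimated reliability at a 1 ms deadline: {reliability:.2f}")
```

Tightening the deadline can only lower this estimate, which is one way to see why latency and reliability targets have to be considered jointly.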
1.2 5G technologies for mobile broadband
This section briefly introduces the main concepts of ultra-dense small cells,
massive MIMO, and millimeter wave communications that are relevant to the scope
of the thesis.
Fig. 2. Illustration of 5G multi-hop self-backhauled mmWave networks ([24] ©2018 IEEE).
[Figure: a macro BS reaches UE 1 to UE K over multi-hop routes (Route 1 to Route 4)
through self-backhauled SCBSs, with full-duplex communication, traffic splitting, and
traffic aggregation; the one-hop transmission range is indicated.]
1.2.1 Ultra-dense small cell networks
In order to boost network capacity and expand coverage, the concept of deploying
low-cost, low-power SCs over traditional macro cell networks has been investigated
[18, 19, 20, 21]. Dense SC deployment brings users closer to the base stations,
resulting in improved wireless connectivity. With multiple-antenna arrays at the SCs,
hybrid beamforming can be leveraged to achieve higher transmission gains, reduce the
transmit power, and mitigate interference between co-tier users. In addition, by
exchanging statistical channel information between base stations, cross-layer
interference can be reduced by a proper hierarchical beamforming design [25] or through
cooperative interference avoidance/management schemes [26, 27, 28, 29]. Furthermore,
the concept of cache-enabled SCs was introduced in [30] to reduce the backhaul load
and improve user experience.
Recent advances in full-duplex (FD) communication offer the potential to double
capacity and lower latency: in-band FD-enabled SCs relay data from a macro BS
to UEs in the same frequency band [31, 32, 33]. As a result, FD enables ultra-dense SC
deployments by using wireless backhaul [34, 35, 36, 37]. In particular, with FD
communication, SCs transmit and receive signals at the same time using
self-interference cancellation algorithms [38]. Instead of using a wired backhaul, the
SCs are connected to the core network via a macro BS over a wireless backhaul, thereby
reducing the deployment cost as compared to traditional cellular networks [39]. In this
regard, large antenna arrays employed at the macro BSs provide highly directional
beamforming for the SCs [21, 22].
Dense SC deployment has great potential for improving the network capacity, but it faces several challenges, such as interference mitigation, resource management, and backhaul/fronthaul limitations. In this regard, this thesis addresses joint load balancing and interference mitigation optimization under wireless backhaul constraints.
1.2.2 Massive MIMO
The basic concept of massive MIMO is to utilize hundreds or thousands of antennas at the BS to serve up to tens or hundreds of UEs [14, 15, 16, 40]. Massive MIMO has been recognized as one of the most promising 5G techniques, which yields remarkable properties such as a high signal-to-interference-plus-noise ratio (SINR) due to extreme spatial multiplexing gains [41, 25, 42]. In addition, the large spatial degree of freedom (DoF) of massive MIMO enables the mitigation of cross-tier and co-tier interference through proper hierarchical precoder design at the BS. In massive MIMO systems, co-channel time-division duplexing (TDD) is considered, in which the macro base stations and the small cells share the entire bandwidth [21, 15, 43, 42]. In TDD systems, the channel reciprocity is exploited, and thus the DL channel can be obtained via the uplink training phase, which leads to reduced channel training overheads [43, 44]. Importantly, the channel estimation overhead scales linearly with the number of users and does not depend on the number of antennas. On the other hand, pilot contamination is considered a major performance-limiting factor in massive MIMO networks, which occurs when non-orthogonal pilot sequences are assigned to users. This is why pilot design should be taken into account when deploying massive MIMO [45, 46, 47, 48].
1.2.3 Millimeter wave communications
MmWave communications collectively refer to the electromagnetic spectrum from 3 to 300 GHz, which corresponds to wavelengths from 100 mm down to 1 mm [9, 10, 1, 13]. A peculiarity is that mmWave communications experience a high degree of path loss and blockage, but offer a larger bandwidth with a short wavelength [49, 50, 13]. Thanks to the small wavelengths at higher frequency bands, a large number of antennas can be packed into a small footprint to achieve highly directional beamforming, which substantially increases link capacity. Large antenna arrays can be deployed at both the transmitter and receiver, which yields high spatial multiplexing gains and overcomes the high path loss and high noise power (due to the large bandwidth) without additional transmission power.
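As a quick check of the frequency-to-wavelength correspondence quoted above, the following minimal sketch evaluates the free-space relation lambda = c/f at a few frequencies. The helper name `wavelength_mm` and the half-wavelength element spacing are illustrative assumptions, not part of the thesis.

```python
# Illustrative sketch (not from the thesis): free-space wavelength lambda = c / f.
SPEED_OF_LIGHT_M_S = 299_792_458.0


def wavelength_mm(freq_ghz: float) -> float:
    """Return the free-space wavelength in millimetres for a frequency in GHz."""
    return SPEED_OF_LIGHT_M_S / (freq_ghz * 1e9) * 1e3


if __name__ == "__main__":
    for freq in (3.0, 30.0, 300.0):
        lam = wavelength_mm(freq)
        # Half-wavelength spacing is an assumed, commonly used antenna element spacing.
        print(f"{freq:6.1f} GHz: wavelength {lam:7.3f} mm, half-wavelength spacing {lam / 2:7.3f} mm")
```

The shrinking half-wavelength spacing is what allows many antenna elements to fit into a small footprint at higher frequencies.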
In this thesis, the above-mentioned 5G technologies are investigated as the key enablers for providing high data rates, low latency, and high reliability. This combination brings paradigm shifts in terms of system design as well as fundamental challenges. These are as follows:
– Integrated access and backhaul: The next generation cellular wireless systems will
be required to dynamically schedule users and efficiently provide a wireless backhaul
to small cells under channel and network dynamics, while satisfying users’ QoS/QoE
requirements [51, 52, 53]. The main contribution of this thesis is to propose an
architecture to support both access and wireless backhaul [54, 55, 56, 23].
– Multi-connectivity and spatial diversity: To improve data rates and reliability,
multi-connectivity and antenna diversity have been studied for decades [57, 58]. For
example, a user can connect to multiple base stations to transmit and receive multiple
copies of data, which improves reliability and capacity [59, 60, 61, 62].
– Load balancing and interference mitigation: The problems of load balancing and interference mitigation become critical for a large number of UEs and BSs [63, 64, 23]. The key questions are which BSs each UE should be associated with, and how to mitigate both co-tier and cross-tier interference [65, 66, 67, 68].
– Beamforming design and tracking: Beamforming is an important strategy to ob-
tain higher transmission gains and alleviate interference [69, 41, 70, 71]. Last but
not least, in high mobility environments, the problems of beam tracking, mobility
management, and handover are very challenging.
1.3 Scope of the thesis
This thesis consists of seven chapters. The first chapter starts by providing a brief
overview of 5G networks. Following that, the main research questions are formulated.
Then, enablers and challenges concerning 5G technologies are provided. In the second
chapter, the research methodologies used to analyse and optimize the considered net-
work scenarios are introduced. In the next four chapters, the author answers the above
research questions. Finally, conclusions are drawn, while highlighting future research
directions. In summary, the contributions of each chapter are as follows:
Chapter 2: The second chapter briefly provides a general background on the mathe-
matical tools used throughout the body of the thesis. Specifically, the basics of
stochastic optimization are introduced to model and solve dynamic network opti-
mization problems. The author then discusses elements of random matrix theory,
which is yet another powerful tool to tackle problems involving high dimensional
data. Due to the non-convex nature of resource allocation problems in wireless
networks, the successive convex approximation technique is used to efficiently
seek locally optimal solutions. The final section provides a brief discussion on reinforcement learning, which is instrumental in addressing uncertainty and risky events in dynamic stochastic networks.
Chapter 3: This chapter proposes a novel integrated access and backhaul architecture
to study the problem of joint load balancing and interference mitigation in het-
erogeneous networks (HetNets). In particular, a massive MIMO macro cell BS
equipped with a large number of antennas, overlaid with wireless self-backhauled
SCs is assumed. Self-backhauled SC BSs with full-duplex communication em-
ploying regular antenna arrays serve their SC users and offload cell-edge macro
users, by using the wireless backhaul from macro BS in the same frequency band.
The joint load balancing and interference mitigation problem is formulated as
a network utility maximization subject to wireless backhaul constraints. Due
to the non-tractability of the problem, the author applies random matrix theory
to obtain a closed-form expression of the achievable rate and transmit power in
the asymptotic regime, i.e., as the number of antennas and users grows large.
Subsequently, leveraging stochastic optimization, the problem is decoupled into
dynamic scheduling of macro cell users, backhaul provisioning of SCs, and of-
floading macro cell users to SCs as a function of interference and backhaul links.
The proposed algorithm is analysed and validated by taking the impact of SC density and transmit power at low and high frequency bands into account.
Chapter 4: In this chapter, a novel solution is proposed to provide Gbps multi-hop
transmissions with latency guarantees. Owing to the severe path loss and unre-
liable transmission over a long distance at higher frequency bands, the author
investigates the problem of path selection and rate allocation for multi-hop self-
backhaul mmWave networks. For this purpose, a new system design is advocated
by exploiting multiple antenna diversity, mmWave bandwidth, and traffic split-
ting techniques. The studied problem is cast as a network utility maximization
problem, subject to a probabilistic latency constraint, network stability, and dy-
namics. By leveraging stochastic optimization, the problem is decoupled into:
(i) path selection and (ii) rate allocation sub-problems, whereby a framework
which selects the best paths is proposed using reinforcement learning techniques.
Moreover, the rate allocation is a non-convex program, which is converted into a
convex problem, and solved using the successive convex approximation method.
Chapter 5: This chapter addresses the fundamental question of how to simultaneously
provide orders of magnitude in capacity improvements and latency reduction. In
particular, the problem of ultra-low latency communication (ULC) is investigated in
mmWave-enabled massive multiple-input multiple-output (MIMO) networks. To
address this matter, the Lyapunov optimization framework is extended to incor-
porate probabilistic latency constraints, which takes the queue length, arrival rate,
and channel variations into account. The studied problem is then decoupled into
a dynamic latency control and rate allocation. Here, the latency control problem
is a difference of convex (DC) programming problem, which is solved efficiently
by the convex-concave procedure (CCP).
Chapter 6: In this chapter, another approach is proposed to enhance ultra-reliable
communication (URC) in 5G mmWave massive MIMO networks. In contrast to
the classical network design based on average metrics, our design objective is to
take both the average metrics and variance of the network utility function into ac-
count. Due to the sensitivity of mmWave links, the proposed solution leverages
principles of risk-sensitive reinforcement learning (RSL) and exploits multiple
antenna diversity and higher bandwidth to optimize transmissions and achieve
Gbps data rates. The prime motivation behind using RSL stems from the fact
that the risk-sensitive utility function to be optimized is a function of not only the
average but also the variance, and thus it captures the tail of the rate distribution
to enable URC. To that end, a distributed risk-sensitive reinforcement learning-
based framework is advocated to jointly optimize the beamwidth and transmit
power. Moreover, the proposed algorithm is fully distributed, and does not re-
quire full network observation.
Chapter 7: This chapter draws the main conclusions of the thesis and discusses future
research directions.
1.4 Author’s contribution
The author’s research work at the University of Oulu has been published in four journal
papers [23, 24, 72, 73], and two conference papers [74, 75]. The thesis is based on all
these works [23, 24, 72, 73, 74, 75], and provides new radio access solutions to enable
multiple gigabit data rates and ultra-reliable and low latency communications. By ap-
plying advanced signal processing techniques, mathematical optimization frameworks,
and reinforcement learning tools, the research provides important solutions to establish
key trade-offs, between aspects such as capacity and latency, and reliability and net-
work density/traffic loads. As the leading author of all the papers above, the author of
the thesis had the main responsibility in proposing the original ideas, formulating the
problems, deriving the mathematical algorithm, conducting the analysis, developing
and carrying out the simulations, evaluating the numerical results, writing the original
papers, and handling the review process. The co-authors provided invaluable comments,
criticism, and supporting ideas for the research.
2 Research methodologies
This chapter describes the mathematical tools used to model and optimize the studied
networks. In particular, the Lyapunov optimization framework, successive convex ap-
proximation method, random matrix theory, and reinforcement learning are sequentially
introduced as follows:
2.1 Stochastic optimization for wireless networks
Stochastic optimization has found applications in wireless networks in the presence of
randomness [76, 77]. For instance, the dynamic nature of wireless channels and stochas-
tic arrivals involves uncertainties and randomness. In the following text, the author
provides a basic introduction of stochastic optimization to model a general stochastic
network optimization problem and solve it by using the Lyapunov drift and a penalty
technique [76, 77].
2.1.1 Queuing networks
Consider a stochastic queuing network that operates in slotted time $t \in \{0, 1, 2, \ldots\}$ [76, 77]. We assume that there are $K$ queues in the network, and the queue vector $\mathbf{Q}(t) = (Q_1(t), \cdots, Q_K(t))$ stores the data backlog at each time slot $t$. For instance, one base station serves up to $K$ users in a cellular network. We first define $\alpha(t)$ as the control action, e.g., power/spectrum allocation, scheduling, routing, or caching. Let $\beta(t)$ denote the random network event, e.g., arrival rate, channels, or queue state. Here, $\mathcal{A}_{\beta(t)}$ denotes the set of possible control actions. Let $a_k(t) = a_k(\alpha(t), \beta(t))$ denote the bursty data arrival for each user $k$, which is i.i.d. over slots and whose second moment is bounded by some finite constant. We define the network attribute as $\mathbf{x}(t) = (x_1(t), \cdots, x_K(t))$ on slot $t$, in which $x_k(t) = x_k(\alpha(t), \beta(t))$ may refer to the network throughput (serving rate or admission rate), transmit power, packet drop rate, latency, cost, or profit. We also assume that the second moment of the network attribute is bounded, as for the network arrival rate. We define the network regime $\mathcal{R}$ as the convex hull of $\mathbf{x}(t)$. The queuing evolution is given by
$$Q_k(t+1) = \max\left[Q_k(t) - x_k(t),\, 0\right] + a_k(t). \tag{1}$$
Fig. 3. Queuing network model (arrival rate $a_k(t)$, queue $Q_k(t)$, serving rate $x_k(t)$).
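The queue evolution in Eq. (1) can be sketched as a short simulation. This is only a toy illustration: the exponential arrival model, the constant serving rate, and the function names `step_queue` and `simulate` are assumptions made here, not the network model of the thesis.

```python
import random


def step_queue(q: float, served: float, arrived: float) -> float:
    """One slot of Eq. (1): Q(t+1) = max[Q(t) - x(t), 0] + a(t)."""
    return max(q - served, 0.0) + arrived


def simulate(num_slots: int, arrival_mean: float, service_rate: float, seed: int = 0) -> float:
    """Return the time-averaged backlog of a single queue (hypothetical toy setup)."""
    rng = random.Random(seed)
    q, total = 0.0, 0.0
    for _ in range(num_slots):
        total += q
        arrival = rng.expovariate(1.0 / arrival_mean)  # assumed i.i.d. arrivals a(t)
        q = step_queue(q, service_rate, arrival)       # constant serving rate x(t)
    return total / num_slots


if __name__ == "__main__":
    for rate in (2.0, 1.05):
        print(f"service rate {rate}: average backlog {simulate(5000, 1.0, rate):.2f}")
```

With the same arrival sequence, a smaller serving rate can only increase the backlog, which is the behaviour the stability definitions below are meant to control.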
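Definition 2.1. [Time average expectation] For any vector $\mathbf{x}(t) = (x_1(t), \ldots, x_K(t))$, let $\bar{\mathbf{x}} = (\bar{x}_1, \cdots, \bar{x}_K)$ denote the time average expectation of $\mathbf{x}(t)$, such that
$$\bar{\mathbf{x}} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\left[\mathbf{x}(\tau)\right].$$
Definition 2.2. [Queue stability] For any discrete queue $Q(t)$ over time slots $t \in \{0, 1, \ldots\}$ with $Q(t) \in \mathbb{R}_+$, $Q(t)$ is stable if
$$\bar{Q} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\left[|Q(\tau)|\right] < \infty.$$
A queue network is stable if each queue is stable.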
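The objective is to determine the actions over time to optimize the following general stochastic network optimization problem [76, 77]:
\begin{align}
\max \quad & f_0(\bar{\mathbf{x}}) \tag{2a}\\
\text{subject to} \quad & g(\bar{\mathbf{x}}) \le 0, \tag{2b}\\
& i(\bar{\mathbf{x}}) = 0, \tag{2c}\\
& \bar{\mathbf{x}} \in \mathcal{R}, \tag{2d}\\
& \text{Queue stability}, \ \forall k, \tag{2e}\\
& \alpha(t) \in \mathcal{A}_{\beta(t)}, \ \forall t, \tag{2f}
\end{align}
where $f_0(\bar{\mathbf{x}}) = \sum_{k=1}^{K} \omega_k f(\bar{x}_k)$, $\omega_k \ge 0$ is the weight of user $k$, and $f(\cdot)$ is assumed to be a twice differentiable, concave, and increasing $L$-Lipschitz function for all $\bar{\mathbf{x}} \in \mathcal{R}$. $g(\cdot)$ is a continuous convex/non-convex function for all $\bar{\mathbf{x}} \in \mathcal{R}$. In addition, $i(\cdot)$ is a linear or non-linear continuous function (e.g., a power constraint).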
2.1.2 Auxiliary variables and virtual queues introduction
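To enable the abstract set constraint (2d) to be met and optimize (2) over possibly non-convex or non-linear functions, we equivalently transform (2) by introducing the auxiliary variables $\boldsymbol{\varphi}(t) = \left(\varphi_1(t), \ldots, \varphi_K(t)\right)$ that satisfy $\boldsymbol{\varphi}(t) \le \mathbf{x}(t)$, where
$$\bar{\varphi}_k \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}_{\beta}\left[\varphi_k(\tau)\right]. \tag{3}$$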
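We can rewrite the constraint functions in (2b) and (2c) as
$$\overline{g(\boldsymbol{\varphi})} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}_{\beta}\left[g(\boldsymbol{\varphi}(\tau))\right], \tag{4}$$
$$\overline{i(\boldsymbol{\varphi})} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}_{\beta}\left[i(\boldsymbol{\varphi}(\tau))\right]. \tag{5}$$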
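With the above transformation, we convert a function of the time average to a time average of functions, which makes the problem easier to solve. Thus, we can refine (2) as follows:
\begin{align}
\min \quad & -\overline{f(\boldsymbol{\varphi})} \tag{6a}\\
\text{subject to} \quad & \boldsymbol{\varphi}(t) \le \mathbf{x}(t), \tag{6b}\\
& \overline{g(\boldsymbol{\varphi})} \le 0, \tag{6c}\\
& \overline{i(\boldsymbol{\varphi})} = 0, \tag{6d}\\
& \bar{\boldsymbol{\varphi}} \in \mathcal{R}, \tag{6e}\\
& \text{(2e), (2f)}. \notag
\end{align}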
To meet (6b) we introduce a virtual queue vector $\mathbf{Y}(t)$ as
$$Y_k(t+1) = \max\left[Y_k(t) + \varphi_k(t) - x_k(t),\, 0\right], \quad \forall k \in \mathcal{K}. \tag{7}$$
Next, two virtual queues $G(t)$ and $I(t)$ are defined to replace the inequality constraint (6c) and the equality constraint (6d), respectively, which are given by
$$G(t+1) = \max\left[G(t) + g(\boldsymbol{\varphi}(t)),\, 0\right], \tag{8}$$
$$I(t+1) = I(t) + i(\boldsymbol{\varphi}(t)). \tag{9}$$
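To illustrate why a virtual queue of the form (7) tracks the constraint it replaces, the sketch below runs the update on a short hand-picked sequence. By construction, the final backlog upper-bounds the accumulated violation, i.e. the sum of (phi - x) over the horizon, so a backlog that grows slower than the horizon keeps the time-average violation small. The function names and the sample values are illustrative assumptions.

```python
def update_virtual_queue(y: float, phi: float, x: float) -> float:
    """One slot of Eq. (7): Y(t+1) = max[Y(t) + phi(t) - x(t), 0]."""
    return max(y + phi - x, 0.0)


def final_backlog(phis, xs) -> float:
    """Run the virtual queue from Y(0) = 0 over paired sequences phi(t), x(t)."""
    y = 0.0
    for phi, x in zip(phis, xs):
        y = update_virtual_queue(y, phi, x)
    return y


if __name__ == "__main__":
    phis = [1.0, 2.0, 0.5, 3.0]   # hypothetical auxiliary variables
    xs = [2.0, 1.0, 1.0, 1.0]     # hypothetical network attributes
    backlog = final_backlog(phis, xs)
    violation = sum(phis) - sum(xs)
    print(f"Y(T) = {backlog}, accumulated (phi - x) = {violation}")
```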
2.1.3 Lyapunov optimization
Lyapunov drift-plus-penalty technique
The queue backlog vector is defined as $\boldsymbol{\Xi}(t) = [\mathbf{Q}(t), \mathbf{Y}(t), G(t), I(t)]$, which involves constraints (2e) and (6b)-(6d) of the transformed problem (6). Hence, for given $\bar{\boldsymbol{\varphi}} \in \mathcal{R}$ and $\alpha(t) \in \mathcal{A}_{\beta(t)}$, the stability of $\boldsymbol{\Xi}(t)$ ensures that all constraints of (6) hold. The main idea of the Lyapunov optimization is to choose actions that maximise/minimise the objective function while keeping the queues stable. Here, the Lyapunov function is written as
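$$L(\boldsymbol{\Xi}(t)) = \frac{1}{2}\left[\sum_{k=1}^{K}\left(Q_k(t)^2 + Y_k(t)^2\right) + G(t)^2 + I(t)^2\right]. \tag{10}$$
For each time slot $t$, $\Delta(\boldsymbol{\Xi}(t))$ denotes the Lyapunov drift, which is given by
$$\Delta(\boldsymbol{\Xi}(t)) = \mathbb{E}\left[L(\boldsymbol{\Xi}(t+1)) - L(\boldsymbol{\Xi}(t)) \mid \boldsymbol{\Xi}(t)\right]. \tag{11}$$
The solution of (6) is obtained by minimizing the Lyapunov drift and a penalty from the objective function, given the existing $\boldsymbol{\Xi}(t)$ and observing $\beta(t)$ for all $t$:
$$\min \ \Delta(\boldsymbol{\Xi}(t)) - \nu\, \mathbb{E}\left[f(\boldsymbol{\varphi}) \mid \boldsymbol{\Xi}(t)\right]. \tag{12}$$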
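Here, $\nu$ is a non-negative constant that controls the optimal minimization solution. Noting that $\max[a,0]^2 \le a^2$ and $(a \pm b)^2 \le a^2 \pm 2ab + b^2$ for any positive real numbers $a, b$, and neglecting the index $t$, we have:
\begin{align*}
\left(\max[Q_k - x_k, 0] + a_k\right)^2 - Q_k^2 &\le 2Q_k(a_k - x_k) + (a_k - x_k)^2,\\
\max[Y_k + \varphi_k - x_k, 0]^2 - Y_k^2 &\le 2Y_k(\varphi_k - x_k) + (\varphi_k - x_k)^2,\\
\max[G + g(\boldsymbol{\varphi}), 0]^2 - G^2 &\le 2G\,g(\boldsymbol{\varphi}) + g(\boldsymbol{\varphi})^2,\\
\left[I + i(\boldsymbol{\varphi})\right]^2 - I^2 &\le 2I\,i(\boldsymbol{\varphi}) + i(\boldsymbol{\varphi})^2.
\end{align*}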
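Now, the objective function of (12) is bounded as
\begin{align}
\Delta(\boldsymbol{\Xi}(t)) - \nu\,\mathbb{E}\left[f(\boldsymbol{\varphi}) \mid \boldsymbol{\Xi}(t)\right] \le\ & \Psi + \sum_{k=1}^{K} Q_k(t)\,\mathbb{E}\left[a_k(t) - x_k(t) \mid \boldsymbol{\Xi}(t)\right] \notag\\
& + \sum_{k=1}^{K} Y_k(t)\,\mathbb{E}\left[\varphi_k(t) - x_k(t) \mid \boldsymbol{\Xi}(t)\right] \tag{13}\\
& + G(t)\,\mathbb{E}\left[g(\boldsymbol{\varphi}(t)) \mid \boldsymbol{\Xi}(t)\right] + I(t)\,\mathbb{E}\left[i(\boldsymbol{\varphi}(t)) \mid \boldsymbol{\Xi}(t)\right] \notag\\
& - \nu\,\mathbb{E}\left[f(\boldsymbol{\varphi}(t)) \mid \boldsymbol{\Xi}(t)\right], \notag
\end{align}
where $\Psi$ is a finite constant that satisfies
$$\Psi \ge \frac{1}{2}\sum_{k=1}^{K}\mathbb{E}\left[\left(a_k(t) - x_k(t)\right)^2 \mid \boldsymbol{\Xi}(t)\right] + \frac{1}{2}\sum_{k=1}^{K}\mathbb{E}\left[\left(\varphi_k(t) - x_k(t)\right)^2 \mid \boldsymbol{\Xi}(t)\right] + \frac{1}{2}\mathbb{E}\left[g(\boldsymbol{\varphi}(t))^2 + i(\boldsymbol{\varphi}(t))^2 \mid \boldsymbol{\Xi}(t)\right],$$
for all $t$ and all possible $\boldsymbol{\Xi}(t)$.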
Determining control variables
Note that the solution to (6) is acquired by minimizing the right-hand side (RHS) of (13)
without constant Ψ in every slot t [77]. Finally, we arrange the variables and decouple
the problem into several sub-problems and update the queues accordingly.
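Sub-problem 1: Select the auxiliary variables to minimize:
\begin{align}
\min \quad & \sum_{k=1}^{K} Y_k(t)\,\varphi_k(t) + G(t)\,g(\boldsymbol{\varphi}(t)) + I(t)\,i(\boldsymbol{\varphi}(t)) - \nu f(\boldsymbol{\varphi}(t)) \tag{14a}\\
\text{subject to} \quad & \boldsymbol{\varphi} \in \mathcal{R}. \tag{14b}
\end{align}
Sub-problem 2: Choose the actions to minimize:
\begin{align}
\min \quad & \sum_{k=1}^{K} -\left[Q_k(t) + Y_k(t)\right] x_k(\alpha(t), \beta(t)) \tag{15a}\\
\text{subject to} \quad & \alpha(t) \in \mathcal{A}_{\beta(t)}, \ \forall t. \tag{15b}
\end{align}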
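The per-slot procedure above can be sketched on a toy scheduling problem. Everything specific here is an assumption made for illustration: one user is scheduled per slot, the utility is f(phi) = log(1 + phi) with unit weights, the constraints g and i are absent, and the names `choose_auxiliary`, `choose_user` and `drift_plus_penalty` are hypothetical. Under these assumptions Sub-problem 1 has a closed-form solution per user, and Sub-problem 2 reduces to a max-weight rule.

```python
import random


def choose_auxiliary(y_k: float, nu: float, phi_max: float) -> float:
    """Sub-problem 1 for f(phi) = log(1 + phi): minimize y_k*phi - nu*log(1 + phi) on [0, phi_max]."""
    if y_k <= 0.0:
        return phi_max  # objective is decreasing in phi, so take the upper bound
    # Stationary point of the convex objective: y_k - nu / (1 + phi) = 0, then clip to the box.
    return min(max(nu / y_k - 1.0, 0.0), phi_max)


def choose_user(q, y, rates) -> int:
    """Sub-problem 2: schedule the user with the largest (Q_k + Y_k) * rate_k."""
    return max(range(len(q)), key=lambda k: (q[k] + y[k]) * rates[k])


def drift_plus_penalty(num_slots=2000, num_users=3, nu=10.0, phi_max=5.0, seed=1):
    """Run the toy drift-plus-penalty loop and return the final queues (Q, Y)."""
    rng = random.Random(seed)
    q = [0.0] * num_users
    y = [0.0] * num_users
    for _ in range(num_slots):
        rates = [rng.uniform(0.5, 4.0) for _ in range(num_users)]     # beta(t): assumed channel rates
        arrivals = [rng.uniform(0.0, 1.0) for _ in range(num_users)]  # a_k(t): assumed arrivals
        phi = [choose_auxiliary(y[k], nu, phi_max) for k in range(num_users)]
        k_star = choose_user(q, y, rates)
        x = [rates[k] if k == k_star else 0.0 for k in range(num_users)]
        for k in range(num_users):
            q[k] = max(q[k] - x[k], 0.0) + arrivals[k]   # Eq. (1)
            y[k] = max(y[k] + phi[k] - x[k], 0.0)        # Eq. (7)
    return q, y


if __name__ == "__main__":
    print(drift_plus_penalty())
```

A larger `nu` puts more weight on the utility and lets the queues grow longer before they influence the decisions.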
2.2 A successive convex approximation technique
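Consider the following non-convex optimization problem:
\begin{align}
\min \quad & f(\mathbf{x}) \tag{16a}\\
\text{subject to} \quad & g(\mathbf{x}) \le 0, \tag{16b}\\
& \mathbf{x} \in \mathcal{R}, \tag{16c}
\end{align}
where $f(\cdot)$ is assumed to be a twice differentiable, convex function for all $\mathbf{x} \in \mathcal{R}$, and $g(\cdot)$ is a continuous non-convex function for all $\mathbf{x} \in \mathcal{R}$. We first assume that the non-convex function has an upper convex approximation function, i.e.,
$$g(\mathbf{x}) \le G(\mathbf{x}, \mathbf{y}), \tag{17}$$
where $G(\mathbf{x}, \mathbf{y})$ is a convex and continuously differentiable function for $\mathbf{x} \in \mathcal{R}$ and a fixed parameter $\mathbf{y} \in \mathcal{R}$.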
The main idea of the successive convex approximation technique is to replace the non-
convex function via its proper upper bound for some appropriately chosen parameter
vector y [78]. We require the convex upper bound to satisfy the following properties:
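Property 2.1. For a given $\mathbf{x} \in \mathcal{R}$, at every iteration $i$ there exists $\mathbf{y}^{(i)} := \psi(\mathbf{x}^{(i)})$ that satisfies
\begin{align}
g(\mathbf{x}) &\le G(\mathbf{x}, \mathbf{y}^{(i)}), \tag{18a}\\
g(\mathbf{x}^{(i)}) &= G(\mathbf{x}^{(i)}, \mathbf{y}^{(i)}), \tag{18b}\\
\nabla g(\mathbf{x}^{(i)}) &= \nabla G(\mathbf{x}^{(i)}, \mathbf{y}^{(i)}), \tag{18c}
\end{align}
where $\nabla g(\cdot)$ is the gradient of $g(\cdot)$.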
In Property 2.1, (18b) and (18c) guarantee that the Karush-Kuhn-Tucker (KKT) op-
timality conditions are satisfied by the convergence points [78]. Moreover, (18a) and
(18b) ensure the feasibility of the iterates and the monotonicity of the objective function.
At each iteration $i$, for a given starting point $\mathbf{x}_0 \in \mathcal{R}$, which is feasible to (16), by setting $\mathbf{y}^{(i)} := \psi(\mathbf{x}_0)$, we arrive at the following convex problem:
\begin{align}
\min \quad & f(\mathbf{x}) \tag{19a}\\
\text{subject to} \quad & G(\mathbf{x}, \mathbf{y}^{(i)}) \le 0, \tag{19b}\\
& \mathbf{x} \in \mathcal{R}. \tag{19c}
\end{align}
We denote $\mathbf{x}^{\star}$ as the optimal solution of (19), which is also feasible for (16) due to the conditions in (18). Thus, $\mathbf{x}^{\star}$ is used as the feasible point for the next iteration $i := i+1$. We set $\mathbf{y}^{(i+1)} := \psi(\mathbf{x}^{\star})$ and $\mathbf{x}^{(i+1)} := \mathbf{x}^{\star}$, and iteratively solve (19) until the convergence condition is achieved.
We note that $f(\mathbf{x}^{\star}) \le f(\mathbf{x}^{(i)})$ for all iterations $i$; hence, the SCA method produces a sequence of feasible solutions whose objective values are monotonically decreasing. The algorithm converges when the objective is bounded below by a finite limit [78].
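The SCA iteration can be sketched on a one-dimensional toy problem chosen here purely for illustration: minimize x subject to g(x) = 1 - x^2 <= 0 with x > 0. Since g is concave, its first-order expansion around y, G(x, y) = 1 + y^2 - 2yx, is a global upper bound that matches g and its gradient at x = y, as required by Property 2.1. The convex subproblem then has the closed-form solution x = (1 + y^2)/(2y). The function name `sca_toy` is hypothetical.

```python
def sca_toy(x0: float, tol: float = 1e-10, max_iter: int = 100):
    """SCA for the toy problem: minimize x subject to 1 - x^2 <= 0, with x > 0."""
    if x0 < 1.0:
        raise ValueError("x0 must be feasible for the original problem (x0 >= 1)")
    iterates = [x0]
    for _ in range(max_iter):
        y = iterates[-1]
        # Convex surrogate G(x, y) = 1 + y^2 - 2*y*x <= 0  <=>  x >= (1 + y^2) / (2*y).
        # The objective x is increasing, so the surrogate problem is solved at this bound.
        x_new = (1.0 + y * y) / (2.0 * y)
        iterates.append(x_new)
        if abs(x_new - y) < tol:
            break
    return iterates


if __name__ == "__main__":
    for i, x in enumerate(sca_toy(3.0)):
        print(f"iteration {i}: x = {x:.12f}")
```

Every iterate stays feasible and the objective decreases monotonically towards the KKT point x = 1 of the toy problem.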
2.3 Random matrix theory
In this section, a brief introduction to random matrix theory is provided to deterministically approximate high dimensional random processes, which requires only the knowledge of the statistical channel correlation matrices $\boldsymbol{\Theta}_m$. In the context of massive MIMO systems serving a large number of users, the wireless channel propagation is often modeled as a large random matrix, and thus, random matrix theory (RMT) provides a powerful tool to characterize the network performance in diverse MIMO scenarios [41, 79].
The author starts by revisiting some important lemmas used when studying large dimensional random matrices, as follows:
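Lemma 2.1. [Matrix inversion] Let $\mathbf{H}$ be an $N \times N$ invertible matrix and $\mathbf{x} \in \mathbb{C}^N$, $c \in \mathbb{C}$ for which $\mathbf{H} + c\,\mathbf{x}\mathbf{x}^{\dagger}$ is invertible. We have
$$\mathbf{x}^{\dagger}\left(\mathbf{H} + c\,\mathbf{x}\mathbf{x}^{\dagger}\right)^{-1} = \frac{\mathbf{x}^{\dagger}\mathbf{H}^{-1}}{1 + c\,\mathbf{x}^{\dagger}\mathbf{H}^{-1}\mathbf{x}}. \tag{20}$$
Lemma 2.2. [Resolvent identity] Let $\mathbf{H}$ and $\mathbf{W}$ be two invertible complex matrices of size $N \times N$. We have
$$\mathbf{H}^{-1} - \mathbf{W}^{-1} = -\mathbf{H}^{-1}\left(\mathbf{H} - \mathbf{W}\right)\mathbf{W}^{-1}. \tag{21}$$
Lemma 2.3. Let $\mathbf{A}_1, \mathbf{A}_2, \cdots$, with $\mathbf{A}_N \in \mathbb{C}^{N \times N}$, be a series of random matrices generated by the probability space $(\Omega, \mathcal{F}, P)$ such that, for $w \in A \subset \Omega$, with $P(A) = 1$, $\|\mathbf{A}_N(w)\| < K(w)$, uniformly on $N$. Let $\mathbf{x}_1, \mathbf{x}_2, \cdots$, with $\mathbf{x}_N \in \mathbb{C}^N$, be random vectors of i.i.d. entries with zero mean, variance $1/N$, and eighth-order moment of order $O(1/N^4)$, independent of $\mathbf{A}_N$. Then
$$\mathbf{x}_N^{\dagger}\mathbf{A}_N\mathbf{x}_N - \frac{1}{N}\mathrm{Tr}\,\mathbf{A}_N \xrightarrow{N \to \infty} 0 \tag{22}$$
almost surely.
Lemma 2.4. Let $\mathbf{A}_N$ be as in Lemma 2.3 and let $\mathbf{x}_N, \mathbf{y}_N \in \mathbb{C}^N$ be random, mutually independent with standard i.i.d. entries of zero mean, variance $1/N$, and eighth-order moment of order $O(1/N^4)$, independent of $\mathbf{A}_N$. Then
$$\mathbf{y}_N^{\dagger}\mathbf{A}_N\mathbf{x}_N \xrightarrow{N \to \infty} 0 \tag{23}$$
almost surely.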
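Lemma 2.5. Let $\mathbf{A}_1, \mathbf{A}_2, \cdots$, with $\mathbf{A}_N \in \mathbb{C}^{N \times N}$, be deterministic with a uniformly bounded spectral norm and let $\mathbf{B}_1, \mathbf{B}_2, \cdots$, with $\mathbf{B}_N \in \mathbb{C}^{N \times N}$, be random Hermitian, with eigenvalues $\lambda_1^{\mathbf{B}_N} \le \cdots \le \lambda_N^{\mathbf{B}_N}$ such that, with probability 1, there exists $\varepsilon > 0$ for which $\lambda_1^{\mathbf{B}_N} > \varepsilon$ for all large $N$. Then, for $\mathbf{v} \in \mathbb{C}^N$,
$$\frac{1}{N}\mathrm{Tr}\,\mathbf{A}_N\mathbf{B}_N^{-1} - \frac{1}{N}\mathrm{Tr}\,\mathbf{A}_N\left(\mathbf{B}_N + \mathbf{v}\mathbf{v}^{\dagger}\right)^{-1} \xrightarrow{N \to \infty} 0 \tag{24}$$
almost surely, where $\mathbf{B}_N^{-1}$ and $\left(\mathbf{B}_N + \mathbf{v}\mathbf{v}^{\dagger}\right)^{-1}$ exist with probability 1.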
Next, Theorem 2.1 is introduced to deterministically approximate the random matrices, which results in a closed-form expression.
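Theorem 2.1. [A deterministic approximation of random matrix]
Let $\mathbf{B}_N = \mathbf{X}_N^{\dagger}\mathbf{X}_N + \mathbf{S}_N$ with $\mathbf{S}_N \in \mathbb{C}^{N \times N}$ Hermitian nonnegative definite and $\mathbf{X}_N \in \mathbb{C}^{n \times N}$ random. The $i$th column $\mathbf{x}_i$ of $\mathbf{X}_N^{H}$ is $\mathbf{x}_i = \boldsymbol{\Psi}_i\mathbf{y}_i$, where the entries of $\mathbf{y}_i \in \mathbb{C}^{r_i}$ are i.i.d. with zero mean and variance $1/N$. The matrices $\boldsymbol{\Psi}_i \in \mathbb{C}^{N \times r_i}$ are deterministic. Furthermore, let $\boldsymbol{\Theta}_i = \boldsymbol{\Psi}_i\boldsymbol{\Psi}_i^{H} \in \mathbb{C}^{N \times N}$ and define $\mathbf{Q}_N \in \mathbb{C}^{N \times N}$ deterministic. Assume $\limsup_{N \to \infty}\sup_{1 \le i \le n}\|\boldsymbol{\Theta}_i\| < \infty$ and let $\mathbf{Q}_N$ have a uniformly bounded spectral norm (with respect to $N$). We define the random matrix identity, which is approximated later, as follows:
$$m_{\mathbf{B}_N,\mathbf{Q}_N}(z) \triangleq \frac{1}{N}\mathrm{Tr}\,\mathbf{Q}_N\left(\mathbf{B}_N - z\mathbf{I}_N\right)^{-1}. \tag{25}$$
Under the assumptions that, for $z \in \mathbb{C} \setminus \mathbb{R}_+$, as $n, N$ grow large with ratios $\beta_{N,i} \triangleq N/r_i$ and $\beta_N \triangleq N/n$ such that $0 < \liminf_N \beta_{N,i} \le \limsup_N \beta_{N,i} < \infty$ and $0 < \liminf_N \beta_N \le \limsup_N \beta_N < \infty$, we get the closed-form expression of (25) as
$$m_{\mathbf{B}_N,\mathbf{Q}_N}(z) - m^{\circ}_{\mathbf{B}_N,\mathbf{Q}_N}(z) \xrightarrow{N \to \infty} 0, \tag{26}$$
almost surely, with $m^{\circ}_{\mathbf{B}_N,\mathbf{Q}_N}(z)$ given by
$$m^{\circ}_{\mathbf{B}_N,\mathbf{Q}_N}(z) = \frac{1}{N}\mathrm{Tr}\,\mathbf{Q}_N\left(\frac{1}{N}\sum_{j=1}^{n}\frac{\boldsymbol{\Theta}_j}{1 + e_{N,j}(z)} + \mathbf{S}_N - z\mathbf{I}_N\right)^{-1}, \tag{27}$$
where the functions $e_{N,1}(z), \ldots, e_{N,n}(z)$ form the unique solution of
$$e_{N,i}(z) = \frac{1}{N}\mathrm{Tr}\,\boldsymbol{\Theta}_i\left(\frac{1}{N}\sum_{j=1}^{n}\frac{\boldsymbol{\Theta}_j}{1 + e_{N,j}(z)} + \mathbf{S}_N - z\mathbf{I}_N\right)^{-1}, \tag{28}$$
which is the Stieltjes transform of a nonnegative finite measure on $\mathbb{R}_+$. Moreover, for $z < 0$, the scalars $e_{N,1}(z), \ldots, e_{N,n}(z)$ are the unique nonnegative solutions to (28).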
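The implicit equations (28) are typically solved by fixed-point iteration. The sketch below covers only a scalar special case assumed here for illustration: all correlation matrices equal the identity and S_N = 0, so that every e_{N,i}(z) coincides with a single scalar e satisfying e = 1 / (c / (1 + e) - z), where c = n/N. The function name `solve_fixed_point` and the starting point e = 0 are assumptions.

```python
def solve_fixed_point(z: float, c: float, tol: float = 1e-12, max_iter: int = 10_000) -> float:
    """Iterate e <- 1 / (c / (1 + e) - z) for z < 0 (scalar special case of Eq. (28))."""
    if z >= 0.0 or c <= 0.0:
        raise ValueError("this sketch assumes z < 0 and c = n/N > 0")
    e = 0.0
    for _ in range(max_iter):
        e_next = 1.0 / (c / (1.0 + e) - z)
        if abs(e_next - e) < tol:
            return e_next
        e = e_next
    return e


if __name__ == "__main__":
    # For z = -1 and c = 1 the equation reduces to e^2 + e - 1 = 0,
    # whose nonnegative root is (sqrt(5) - 1) / 2.
    print(solve_fixed_point(-1.0, 1.0))
```

In the general case each iteration would instead invert the N x N matrix in (28), with one scalar per correlation matrix.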
2.4 Reinforcement learning
Reinforcement learning is an area of machine learning in which agents perform actions to interact with the environment so as to maximize the cumulative reward [80]. By evaluating feedback from their own actions and experiences, the agents determine a sequence of best actions that maximizes the long-term reward.
Basically, reinforcement learning is concerned with decision making that enables adaptation and self-organization: the agents spend time discovering actions to find the best strategies, and then exploit them in the long run. At each time slot t, each agent selects an action from a possible action set, observes the environment, and experiences the reward, as shown in Fig. 4. In the next time slot t+1, the agent evaluates the decision made in the previous time slot and selects the next action based on the distribution of the action-reward. Here, the concept of regret strategy is employed, where regret is defined as the difference between the average utility when choosing the same actions in previous times, and the average utility obtained by constantly selecting different actions. The premise is that regret should be minimized over time so as to choose the best sequence of actions.
Fig. 4. Reinforcement learning model: the agent applies an action at time t to the environment and receives an observation, a reward, and the new state at time t+1; each episode comprises an uplink training phase, a downlink transmission phase, and an uplink transmission and feedback phase ([73] ©2018 IEEE).
The important elements of reinforcement learning include agents, actions, reward
function, policy and environment, which are briefly described as follows:
– Agents can be network operators, base stations, or users, who want to maximize their
cumulative reward functions.
– Actions are defined as a set of things that agents do to solve their concerns with the
environments. In the context of resource allocation, actions could consist of user
association, power assignment, or beamwidth selection.
– Reward function is defined as the cumulative return for the agent after applying
selected actions to the environment. Network utility function and power consumption
are common metrics used to measure the reward.
– Policy refers to strategies that the agents play to determine next action based on the
distribution of actions-rewards. It is a mapping between action and state. Here, a
state is the current condition of the environment such as the channel state, or network
queuing state.
– The environment contains the network system, where the agents play their actions
to maximize the reward. At the beginning of each time slot, the agents observe the
reward, which reflects the noise and interference in the environment.
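The interplay of agent, actions, reward, policy and environment listed above can be sketched with a single agent choosing among a few actions. Note that this sketch swaps in a plain epsilon-greedy rule with running-average reward estimates, which is simpler than the regret-based strategy described in the text. The mean rewards, the noise level, and the function name `run_agent` are assumptions for illustration.

```python
import random


def run_agent(mean_rewards, num_slots=5000, epsilon=0.1, noise_std=0.1, seed=0):
    """Epsilon-greedy agent; returns (per-action reward estimates, average reward)."""
    rng = random.Random(seed)
    num_actions = len(mean_rewards)
    estimates = [0.0] * num_actions
    counts = [0] * num_actions
    total = 0.0
    for _ in range(num_slots):
        if rng.random() < epsilon:
            action = rng.randrange(num_actions)                            # explore
        else:
            action = max(range(num_actions), key=lambda a: estimates[a])   # exploit
        # Environment feedback: assumed noisy reward around a fixed mean.
        reward = mean_rewards[action] + rng.gauss(0.0, noise_std)
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]  # running average
        total += reward
    return estimates, total / num_slots


if __name__ == "__main__":
    est, avg = run_agent([0.2, 0.5, 0.8])
    print("estimates:", [round(v, 3) for v in est], "average reward:", round(avg, 3))
```

Here the actions could stand for, e.g., candidate beamwidths or power levels, with the reward standing for the resulting utility.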
3 Integrated access and backhaul architecture
Starting with this chapter, each research question is answered in turn. In particular, this
chapter addresses the first question Q1 by proposing an integrated access-backhaul
(IAB) framework to dynamically schedule users, while efficiently providing a wireless
backhaul to dense small cells and mitigating interference. In addition, joint resource al-
location and interference mitigation solutions are proposed for two-hop self-backhauled
networks.
3.1 Main contributions and related work
The main contributions are listed as follows:
– The problem of joint load balancing (user association and user scheduling) and inter-
ference management (beamforming design and power allocation) for 5G HetNets is
modelled in which a DL scheduler is designed at the MBS to schedule macro UEs and
provide a backhaul to FD-enabled SCs, while FD-capable SCs serve both MUEs and
small cell UEs in the same frequency band. Moreover, an interference management
scheme is proposed to mitigate both co-tier and cross-tier interference from the MBS
and FD-enabled SCs by designing a hierarchical precoding scheme and controlling
the transmission of the SCs. The problem is cast as a network utility maximization
(NUM) problem subject to dynamic wireless backhaul constraints, traffic load, and
imperfect channel state information (CSI). To make the problem tractable, by invok-
ing results from random matrix theory (RMT), we derive a closed-form expression
of the signal-to-interference-plus-noise-ratio (SINR) and transmit power when the
numbers of MBS antennas and users grow very large.
– A Lyapunov framework is applied to solve the NUM problem in polynomial time.
The NUM problem is decomposed into the dynamic scheduling of MUEs, as well as
the backhaul provisioning of FD-enabled SCs, and offloading MUEs to FD-enabled
SCs. The joint load balancing and operation mode (FD or half-duplex) subproblem,
which is a non-convex program with binary variables, is converted into a convex
program by using the successive convex approximation (SCA) method. The motivations for using SCA are its low complexity and fast convergence, and the fact that most relaxed variables in the obtained solution are close to zero or one.
– A performance evaluation is carried out to compare the proposed algorithm with other
baselines under the impact of SC density and transmit power levels at low/high fre-
quency bands. A comprehensive performance analysis of our proposed algorithm
based on the Lyapunov framework is provided in Appendix 1. There exists an [O(1/ν), O(ν)] utility-queue backlog tradeoff, which leads to a utility-latency balance [72], where ν is the Lyapunov control parameter. Moreover, the convergence of the SCA-based approximation method is analysed.
Related work
An overview of cellular backhaul technologies, together with the associated design issues and challenges, was given in [54]. Recent work on mmWave access and backhauling for 5G communication systems is discussed in [52]. The Xhaul architecture presented in [52] aims to develop a 5G integrated backhaul and fronthaul transport network enabling flexible and software-defined reconfiguration of all networking elements in a multi-tenant and service-oriented unified management environment. As pointed out in [67, 68], the current solutions for user association problems ignore the backhaul constraints, which are crucial since open access SCs with either wired or wireless backhaul always face a limited backhaul capacity.
Moreover, the load balancing problem should take imperfect CSI into account due
to mobility, which is ignored in the previous work. Our previous work in [74] consid-
ered the problem of joint in-band scheduling and interference mitigation in 5G HetNets
without considering the user association. In this chapter, we extend [74] by considering
the load balancing problem taking into account the backhaul constraint and imperfect
CSI, and further this chapter provides insights into the performance analysis of our
proposed algorithm based on the Lyapunov framework and convergence of the SCA
method.
The rest of this chapter is organized as follows. Section 3.2 describes the system model and Section 3.3 provides the problem formulation for load balancing and interference mitigation. Section 3.4 introduces the Lyapunov framework used to solve our problem. In Section 3.5, we present the numerical results. We conclude the chapter in Section 3.6.
3.2 System model
The downlink (DL) transmission of a HetNet scenario is considered as shown in Fig. 1, in which an MBS $b_0$ is underlaid with a set of $S$ uniformly deployed FD-enabled SCs, $\mathcal{S} = \{b_s \mid s \in \{1, \ldots, S\}\}$. Let $\mathcal{B} = \{b_0\} \cup \mathcal{S}$ denote the set of all base stations, where $|\mathcal{B}| = 1 + S$. The MBS is equipped with $N$ antennas and serves a set of $M$ single-antenna MUEs, $\mathcal{M} = \{1, \ldots, M\}$. Let $\mathcal{K} = \mathcal{M} \cup \mathcal{S}$ denote the set of users associated with MBS $b_0$, where $|\mathcal{K}| = K = M + S$. The user indices $k = 1, 2, \ldots, M$ represent the corresponding MUE indices $m = 1, 2, \ldots, M$, while the user indices $k = M+1, M+2, \ldots, M+S$ represent the corresponding SC indices $s = 1, 2, \ldots, S$. We assume an open access policy at the FD-enabled SCs, and each FD-enabled SC is assumed to be equipped with $N_s + 1$ antennas: one receiving antenna is used for the wireless backhaul and $N_s$ transmitting antennas are used to serve its single-antenna small cell UEs (SUEs) or other MUEs in the same frequency band. Let $\mathcal{C} = \{c_1, c_2, \ldots, c_S\}$ denote the set of SUEs, where $|\mathcal{C}| = S$. Moreover, the SCs are assumed to be FD capable with perfect self-interference cancellation (SIC) capabilities [31, 33, 81]. A co-channel time-division duplexing (TDD) protocol is considered in which the MBS and FD-enabled SCs share the entire bandwidth, and the DL transmissions occur at the same time. In this work, we consider a large number of antennas at both the macro and SC BSs and a dense deployment of MUEs and SCs, such that $M, N, N_s, S \gg 1$.

We denote $\mathbf{h}_m^{(b_0)} = [h_m^{(b_0,1)}, h_m^{(b_0,2)}, \cdots, h_m^{(b_0,N)}]^T \in \mathbb{C}^{N \times 1}$ as the propagation channel between the $m$th MUE and the antennas of the MBS $b_0$, in which $h_m^{(b_0,n)}$ is the channel between the $m$th MUE and the $n$th MBS antenna. Let $\mathbf{H}^{(b_0),\mathcal{M}} = [\mathbf{h}_1^{(b_0)}, \mathbf{h}_2^{(b_0)}, \cdots, \mathbf{h}_M^{(b_0)}] \in \mathbb{C}^{N \times M}$ denote the channel matrix between all MUEs and the MBS antennas. Moreover, we assume imperfect CSI for the MUEs due to mobility and we denote $\hat{\mathbf{H}}^{(b_0),\mathcal{M}} = [\hat{\mathbf{h}}_1^{(b_0)}, \hat{\mathbf{h}}_2^{(b_0)}, \cdots, \hat{\mathbf{h}}_M^{(b_0)}] \in \mathbb{C}^{N \times M}$ as the estimate of $\mathbf{H}^{(b_0),\mathcal{M}}$, in which the imperfect CSI can be modeled as [16]:
\[
\hat{\mathbf{h}}_m^{(b_0)} = \sqrt{N}\, \boldsymbol{\Theta}_m^{(b_0)} \hat{\mathbf{w}}_m^{(b_0)}, \tag{29}
\]
where $\hat{\mathbf{w}}_m^{(b_0)} = \sqrt{1 - \tau_m^2}\, \mathbf{w}_m^{(b_0)} + \tau_m \mathbf{z}_m^{(b_0)}$ is the estimate of the small-scale fading channel and $\boldsymbol{\Theta}_m^{(b_0)}$ is the spatial channel correlation matrix that accounts for the path loss and shadow fading. Note that, due to limited spatial scattering in the MIMO channel, the rank of the correlation matrix is much smaller than the number of antennas, i.e., $\operatorname{rank}(\boldsymbol{\Theta}_m^{(b_0)}) \le N$. The spatial channel model is clustered, i.e., the correlation matrices belong to a finite set of finite size [25]. Here, $\mathbf{w}_m^{(b_0)}$ and $\mathbf{z}_m^{(b_0)}$ are the real channel and the channel noise, respectively, each modelled as a Gaussian random vector with zero mean and variance $1/N$. The channel estimate error of MUE $m$ is denoted by $\tau_m$, $\tau_m \in [0, 1]$; in the case of perfect CSI, $\tau_m = 0$. Similarly, let $\mathbf{H}^{(b_0),\mathcal{S}} \in \mathbb{C}^{N \times S}$ and $\mathbf{H}^{(b_0),\mathcal{C}} \in \mathbb{C}^{N \times S}$ denote the channel matrices from the MBS antennas to the SCs and SUEs, respectively. Let $\mathbf{h}_u^{(b_s)} \in \mathbb{C}^{N_s \times 1}$ denote the propagation channel from SC $b_s$ to any receiver $u$. Let $c_s$ denote the SUE served by the SC $b_s$.
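As a rough illustration of the imperfect-CSI model around (29), the sketch below mixes a true small-scale fading vector with an independent noise vector using the estimation-error parameter τ. It is only a toy under stated assumptions: the helper names, the seed, and the antenna count are hypothetical, and the multiplication by the correlation matrix and by √N is left out.

```python
import math
import random

def imperfect_small_scale(w, z, tau):
    """Return sqrt(1 - tau^2) * w + tau * z element-wise (the mixing step below (29))."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    scale = math.sqrt(1.0 - tau * tau)
    return [scale * wi + tau * zi for wi, zi in zip(w, z)]

def complex_gaussian(n, variance, rng):
    """Draw n i.i.d. zero-mean complex Gaussian samples with the given total variance."""
    sigma = math.sqrt(variance / 2.0)  # variance split over real and imaginary parts
    return [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(n)]

rng = random.Random(0)  # fixed seed, illustrative only
N = 4                   # toy antenna count, far below the massive-MIMO regime
w = complex_gaussian(N, 1.0 / N, rng)  # real channel
z = complex_gaussian(N, 1.0 / N, rng)  # channel noise

w_perfect = imperfect_small_scale(w, z, tau=0.0)  # perfect CSI: estimate equals w
w_noise = imperfect_small_scale(w, z, tau=1.0)    # no channel knowledge: estimate equals z
```

The two extreme settings show how τ interpolates between perfect CSI and a fully uninformative estimate.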
3.3 Load balancing and interference mitigation
In this section, we formulate the joint optimization of user association, user scheduling,
beamforming design, and power allocation. To that end, we first derive the received
signal, data rate, and transmit power for each receiver (SCs are also treated as UEs of the
macro BS). We then formulate the problem as a network utility maximization subject to
wireless backhaul constraints. However, the formulated problem does not have closed-
form expressions for the objective and constraints. Hence, we apply RMT [41] to obtain
these closed-form expressions. We finally utilize the tool of stochastic optimization to
decouple our problem into several solvable sub-problems.
The problem of user scheduling and user association for load balancing in the DL is addressed, in which the MBS simultaneously provides data transmission to MUEs and wireless backhaul to the FD-enabled SCs, while the SCs with an FD capability serve both SUEs and MUEs. For each MUE $m \in \mathcal{M}$, let the binary variable $l_m^{(b_s)}$ indicate the transmission association from BS $b_s \in \mathcal{B}$ to MUE $m$, i.e., $l_m^{(b_s)} = 1$ when MUE $m$ is associated with BS $b_s$, and $l_m^{(b_s)} = 0$ otherwise. Similarly, let the binary variables $l_{s+M}^{(b_0)}$ and $l_{c_s}^{(b_s)}$ denote the transmission association indicators from MBS $b_0$ to SC $s$ and from SC $b_s$ to SUE $c_s$, respectively. We assume that each MUE $m$ connects to one BS (either MBS $b_0$ or SC $b_s$) at time slot $t$. Each SC is equipped with $N_s$ transmitting antennas, and we assume that each SC serves up to $N_s^{\mathrm{au}}$ active users (either SUE or MUE) at each time slot, such that $N_s^{\mathrm{au}} \le N_s$, where the superscript au stands for "active users". Hence, we have the following constraints for load balancing:
\[
\sum_{s=0}^{S} l_m^{(b_s)} \le 1, \qquad \sum_{m=1}^{M} l_m^{(b_s)} + l_{c_s}^{(b_s)} \le N_s^{\mathrm{au}}, \qquad \forall s \in \mathcal{S},\ m \in \mathcal{M}. \tag{30}
\]
We define the vector $\mathbf{l} = \{ l_j^{(b_s)} \mid b_s \in \mathcal{B},\ j \in \mathcal{M} \cup \mathcal{S} \cup \mathcal{C} \}$ containing all transmission indicators between BSs and UEs. Let $N_s^{\mathrm{tx}} = \sum_{m=1}^{M} l_m^{(b_s)} + l_{c_s}^{(b_s)}$ be the total number of transmissions at SC $b_s$, where the superscript tx stands for "transmissions"; the latter constraint of (30) thus becomes $N_s^{\mathrm{tx}} \le N_s^{\mathrm{au}}, \forall s \in \mathcal{S}$.
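The load-balancing constraints in (30) can be checked mechanically for a candidate association. The following is a minimal sketch with hypothetical data structures (nested lists indexed by BS and MUE); it is not taken from the thesis.

```python
def check_load_balancing(l_mue, l_sue, n_au):
    """
    l_mue[s][m]: 1 if MUE m is associated with BS b_s (s = 0 is the MBS), else 0.
    l_sue[s]:    1 if SC b_s serves its own SUE c_s (index 0 is unused), else 0.
    n_au[s]:     maximum number of active users at SC b_s (index 0 is unused).
    Returns (ok, n_tx), where n_tx[s] is the number of transmissions at SC b_s.
    """
    num_bs = len(l_mue)
    num_mue = len(l_mue[0])
    # First part of (30): each MUE connects to at most one BS.
    one_bs = all(sum(l_mue[s][m] for s in range(num_bs)) <= 1 for m in range(num_mue))
    # N_s^tx = sum_m l_m^(b_s) + l_cs^(b_s) for every SC s = 1..S.
    n_tx = {s: sum(l_mue[s]) + l_sue[s] for s in range(1, num_bs)}
    # Second part of (30): N_s^tx <= N_s^au.
    within_budget = all(n_tx[s] <= n_au[s] for s in n_tx)
    return one_bs and within_budget, n_tx

# Toy instance: S = 2 SCs, M = 3 MUEs; rows are b_0, b_1, b_2.
l_mue = [[1, 0, 0],
         [0, 1, 0],
         [0, 0, 1]]
l_sue = [0, 1, 1]
n_au = [0, 2, 2]
ok, n_tx = check_load_balancing(l_mue, l_sue, n_au)

# Attaching MUE 0 to b_1 as well violates the first part of (30).
l_bad = [row[:] for row in l_mue]
l_bad[1][0] = 1
bad_ok, _ = check_load_balancing(l_bad, l_sue, n_au)
```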
3.3.1 Downlink transmission
The MBS serves two types of users: MUEs with imperfect CSI and FD-enabled SCs with perfect CSI. Let $p_m^{(b_0)}$, $p_{s+M}^{(b_0)}$, and $P^{(b_0)}$ denote the DL MBS transmit power assigned to MUE $m$, the DL MBS transmit power assigned to SC $s$, and the maximum transmit power at the MBS, respectively. We focus on the multiple-input single-output (MISO) channel, where the MBS with $N$ antennas can serve $K$ UEs. Here, we take into account user scheduling and association, so our proposal also applies to the case in which the number of UEs is larger than the number of antennas, i.e., $K > N$. Although an SC exploits its FD capability to double capacity, an FD-enabled SC causes unwanted FD interference: cross-tier interference to the adjacent MUEs (or other SCs) and co-tier interference to other UEs. Hence, in order to convert the interference channel to the MISO channel, we design a precoder at the MBS and propose an operation mode policy that controls the FD interference so that the total FD interference can be treated as additional noise.
Definition 3.1. [Operation Mode Policy] We define $\boldsymbol{\phi}$ as the operation mode that controls the FD-enabled SC transmission in order to reduce FD interference. The operation mode is expressed as $\boldsymbol{\phi}(t) = \{ \phi^{(b_s)}(t) \mid \phi^{(b_s)}(t) \in \{0, 1\},\ \forall s \in \mathcal{S} \}$. Here, $\phi^{(b_s)}(t) = 1$ indicates that SC $b_s$ operates in FD mode and $\phi^{(b_s)}(t) = 0$ indicates half-duplex (HD) mode.
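We assume that the MBS uses a precoding scheme $\mathbf{V} = [\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_K] \in \mathbb{C}^{N \times K}$. To exploit the degrees of freedom of massive MIMO, the hierarchical interference mitigation scheme in [25, 82] is applied to design the precoder, i.e., $\mathbf{V} = \mathbf{U}\mathbf{T}$, where $\mathbf{T} \in \mathbb{C}^{N_{\mathrm{itf}} \times K}$ is used to control the co-tier interference and capture the spatial multiplexing gain, and $\mathbf{U} \in \mathbb{C}^{N \times N_{\mathrm{itf}}}$ is used to suppress cross-tier interference. Here, $N_{\mathrm{itf}} < N$, where the subscript itf stands for "interference". The precoder $\mathbf{U}$ is chosen such that
\[
\mathbf{U}^{\dagger} \sum_{s=1}^{S} \phi^{(b_s)} \boldsymbol{\Theta}_s^{(b_0)} = \mathbf{0}, \tag{31}
\]
where $\boldsymbol{\Theta}_s^{(b_0)} \in \mathbb{C}^{N \times N}$ is the sum of the correlation matrices between the MBS antennas and the users that belong to SC $s$. Here, $\mathbf{U}$ is in the null space of $\sum_{s=1}^{S} \phi^{(b_s)} \boldsymbol{\Theta}_s^{(b_0)}$. Note that $\phi^{(b_s)}$ determines whether the transmission of the FD-enabled SC is enabled or not. The precoder $\mathbf{T}$ is designed to adapt to the real-time CSI based on the effective channel $\mathbf{H}\mathbf{U} \in \mathbb{C}^{K \times N_{\mathrm{itf}}}$, where $\mathbf{H} = [\mathbf{h}_k^{(b_0)}]^{\dagger}_{k \in \mathcal{K}}$. In this chapter, we consider the regularized zero-forcing (RZF) precoding¹ that is given by
\[
\mathbf{T} = \left( \mathbf{U}^{\dagger} \mathbf{H}^{\dagger} \mathbf{H} \mathbf{U} + N \zeta \mathbf{I}_{N_{\mathrm{itf}}} \right)^{-1} \mathbf{U}^{\dagger} \mathbf{H}^{\dagger},
\]
where the regularization parameter $\zeta > 0$ is scaled by $N$ to ensure that the matrix $\mathbf{U}^{\dagger} \mathbf{H}^{\dagger} \mathbf{H} \mathbf{U} + N \zeta \mathbf{I}_{N_{\mathrm{itf}}}$ is well conditioned as $N \to \infty$. The precoder $\mathbf{T}$ is chosen to satisfy the power constraint $\operatorname{Tr}(\mathbf{P}\mathbf{T}^{\dagger}\mathbf{T}) \le P^{(b_0)}$, where $\mathbf{P} = \operatorname{diag}(p_1^{(b_0)}, p_2^{(b_0)}, \ldots, p_K^{(b_0)})$. We also assume that each SC uses ZF precoding to serve its users, $\mathbf{F}^{(b_s)} = [\mathbf{f}_1^{(b_s)}, \mathbf{f}_2^{(b_s)}, \ldots, \mathbf{f}_{N_s^{\mathrm{tx}}}^{(b_s)}] \in \mathbb{C}^{N_s \times N_s^{\mathrm{tx}}}$, which reads $\mathbf{f}_u^{(b_s)} = \mathbf{h}_u^{(b_s)\dagger} \big( \mathbf{h}_u^{(b_s)} \mathbf{h}_u^{(b_s)\dagger} \big)^{-1}$, such that $\mathbf{F}^{(b_s)}$ is chosen to satisfy the equality power constraint $\operatorname{Tr}(\mathbf{P}^{(b_s)} \mathbf{F}^{(b_s)\dagger} \mathbf{F}^{(b_s)}) = P^{(b_s)}$.² Here, $\mathbf{P}^{(b_s)} = \operatorname{diag}(p_1^{(b_s)}, p_2^{(b_s)}, \ldots, p_{N_s^{\mathrm{tx}}}^{(b_s)})$. The propagation channel from the SC $b_s$ to the MUE $m$ (referred to as user $u$) is
\[
\mathbf{h}_u^{(b_s)} = \mathbf{h}_m^{(b_s)} = \sqrt{N_s}\, \boldsymbol{\Theta}_m^{(b_s)} \left( \sqrt{1 - \tau_m^2}\, \mathbf{w}_m^{(b_s)} + \tau_m \mathbf{z}_m^{(b_s)} \right),
\]
where $\boldsymbol{\Theta}_m^{(b_s)} \in \mathbb{C}^{N_s \times N_s}$ is the channel correlation matrix. Here, $\mathbf{w}_m^{(b_s)}$ and $\mathbf{z}_m^{(b_s)}$ are the real channel and the channel noise from SC $b_s$ to MUE $m$, respectively, each modelled as a Gaussian random vector with zero mean and a variance of $1/N_s$.

1 Other (hybrid) precoders are left for future work; the analogue beamforming design is not introduced in this chapter, and we assume that the analogue beamforming gain is normalized to one.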
By utilizing a massive number of antennas at the MBS, a large spatial degree of freedom is utilized to serve MUEs and FD-enabled SCs, while the remaining degrees of freedom are used to mitigate cross-tier interference. In a massive MIMO system, the total number of antennas is considered as the degree of freedom [25]. Hence, we have the antenna constraint for user association and operation mode, $\sum_{k=1}^{K} l_k^{(b_0)}(t) + \sum_{s=1}^{S} N_s^{\mathrm{tx}}(t) \le N$. For notational simplicity, we remove the time dependency from the symbols throughout the discussion. The received signal $y_m^{(b_0)}$ at each MUE $m \in \mathcal{M}$ at time instant $t$ is given by
\[
y_m^{(b_0)} = l_m^{(b_0)} \sqrt{p_m^{(b_0)}}\, \mathbf{h}_m^{(b_0)\dagger} \mathbf{v}_m x_m^{(b_0)}
+ \underbrace{\sum_{s=1}^{S} \phi^{(b_s)} \sum_{u=1}^{N_s^{\mathrm{tx}}} l_u^{(b_s)} \sqrt{p_u^{(b_s)}}\, \mathbf{h}_m^{(b_s)\dagger} \mathbf{f}_u^{(b_s)} x_u^{(b_s)}}_{\text{cross-tier interference}}
+ \underbrace{\sum_{k=1, k \neq m}^{K} l_k^{(b_0)} \sqrt{p_k^{(b_0)}}\, \mathbf{h}_m^{(b_0)\dagger} \mathbf{v}_k x_k^{(b_0)}}_{\text{co-tier interference}}
+ \eta_m, \tag{32}
\]
where $x_m^{(b_0)}$ is the signal symbol from the MBS to the MUE $m$, $\mathbf{v}_m$ is the precoding vector of MUE $m$, $\eta_m \sim \mathcal{CN}(0, \sigma^2)$ is the thermal noise at MUE $m$, and $x_u^{(b_s)}$ is the transmit signal symbol from SC $b_s$ to its user $u$.
2 We chose the equality constraint for the transmit power at the SCs to reach the optimal rate at maximum power, rather than using $\operatorname{Tr}(\mathbf{P}^{(b_s)} \mathbf{F}^{(b_s)\dagger} \mathbf{F}^{(b_s)}) \le P^{(b_s)}$, since the power at the SCs is relatively small.
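At time instant $t$, the received signal $y_{s+M}^{(b_0)}$ at each SC $s \in \mathcal{S}$ suffers from self-interference, as well as cross-tier and co-tier interference, and is given by
\[
\begin{aligned}
y_{s+M}^{(b_0)} ={}& l_{s+M}^{(b_0)} \sqrt{p_{s+M}^{(b_0)}}\, \mathbf{h}_{s+M}^{(b_0)\dagger} \mathbf{v}_{s+M} x_{s+M}^{(b_0)}
+ \underbrace{\sum_{s'=1, s' \neq s}^{S} \phi^{(b_{s'})} \sum_{u'=1}^{N_{s'}^{\mathrm{tx}}} l_{u'}^{(b_{s'})} \sqrt{p_{u'}^{(b_{s'})}}\, \mathbf{h}_{s}^{(b_{s'})\dagger} \mathbf{f}_{u'}^{(b_{s'})} x_{u'}^{(b_{s'})}}_{\text{cross-tier interference}} \\
&+ \underbrace{\phi^{(b_s)} \sum_{u=1}^{N_s^{\mathrm{tx}}} l_u^{(b_s)} \sqrt{p_u^{(b_s)}}\, \mathbf{h}_{s}^{(b_s)\dagger} \mathbf{f}_u^{(b_s)} x_u^{(b_s)}}_{\text{self-interference}}
+ \underbrace{\sum_{k=1, k \neq s+M}^{K} l_k^{(b_0)} \sqrt{p_k^{(b_0)}}\, \mathbf{h}_{s+M}^{(b_0)\dagger} \mathbf{v}_k x_k^{(b_0)}}_{\text{co-tier interference}}
+ \eta_{s+M},
\end{aligned} \tag{33}
\]
where $x_{s+M}^{(b_0)}$ is the signal symbol from the MBS to the SC $s$, $\mathbf{v}_{s+M}$ is the precoding vector of SC $s$, and $\eta_{s+M} \sim \mathcal{CN}(0, \sigma^2)$ is the thermal noise of the SC $s$.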
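The received signal from the SC $b_s$ at receiver $u$ is $y_u^{(b_s)} = 0$ if the SC $b_s$ operates in HD mode, $\phi^{(b_s)} = 0$. For FD mode, $\phi^{(b_s)} = 1$, the received signal $y_u^{(b_s)}$ is given by
\[
\begin{aligned}
y_u^{(b_s)} ={}& \phi^{(b_s)} l_u^{(b_s)} \sqrt{p_u^{(b_s)}}\, \mathbf{h}_u^{(b_s)\dagger} \mathbf{f}_u^{(b_s)} x_u^{(b_s)}
+ \underbrace{\sum_{s'=1, s' \neq s}^{S} \phi^{(b_{s'})} \sum_{u'=1}^{N_{s'}^{\mathrm{tx}}} l_{u'}^{(b_{s'})} \sqrt{p_{u'}^{(b_{s'})}}\, \mathbf{h}_u^{(b_{s'})\dagger} \mathbf{f}_{u'}^{(b_{s'})} x_{u'}^{(b_{s'})}}_{\text{co-tier interference}} \\
&+ \underbrace{\phi^{(b_s)} \sum_{j=1, j \neq u}^{N_s^{\mathrm{tx}}} l_j^{(b_s)} \sqrt{p_j^{(b_s)}}\, \mathbf{h}_u^{(b_s)\dagger} \mathbf{f}_j^{(b_s)} x_j^{(b_s)}}_{\text{co-tier self-interference}}
+ \underbrace{\sum_{k=1, k \neq u}^{K} l_k^{(b_0)} \sqrt{p_k^{(b_0)}}\, \mathbf{h}_u^{(b_0)\dagger} \mathbf{v}_k x_k^{(b_0)}}_{\text{cross-tier interference}}
+ \eta_u,
\end{aligned} \tag{34}
\]
where $x_u^{(b_s)}$ is the transmit data symbol from the SC $b_s$ to receiver $u$ and $\eta_u \sim \mathcal{CN}(0, \sigma^2)$ is the thermal noise at receiver $u$. The receiver $u$ can be either an SUE or an MUE.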
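The precoder $\mathbf{V}$ is designed at the MBS to null the co-tier interference and to completely remove the cross-tier interference to the SCs' users (31), and the self-interference is well treated, while $\operatorname{Tr}(\mathbf{P}^{(b_s)} \mathbf{F}^{(b_s)\dagger} \mathbf{F}^{(b_s)}) = P^{(b_s)}$. Thus, according to (32)-(34), the SINRs of an MUE $m$ served by the MBS, an SC $s$ served by the MBS, and a receiver $u$ served by an SC are given in (35)-(37), respectively:
\[
\gamma_m^{(b_0)} = \frac{l_m^{(b_0)} p_m^{(b_0)} |\mathbf{h}_m^{(b_0)\dagger} \mathbf{v}_m|^2}{\sum_{k \neq m} l_k^{(b_0)} p_k^{(b_0)} |\mathbf{h}_m^{(b_0)\dagger} \mathbf{v}_k|^2 + \sum_{s} \phi^{(b_s)} P^{(b_s)} |\mathbf{h}_m^{(b_s)\dagger}|^2 + \sigma^2}, \tag{35}
\]
\[
\gamma_{s+M}^{(b_0)} = \frac{l_{s+M}^{(b_0)} p_{s+M}^{(b_0)} |\mathbf{h}_{s+M}^{(b_0)\dagger} \mathbf{v}_{s+M}|^2}{\sum_{k \neq s+M} l_k^{(b_0)} p_k^{(b_0)} |\mathbf{h}_{s+M}^{(b_0)\dagger} \mathbf{v}_k|^2 + \sum_{s' \neq s} \phi^{(b_{s'})} P^{(b_{s'})} |\mathbf{h}_s^{(b_{s'})\dagger}|^2 + \sigma^2}, \tag{36}
\]
\[
\gamma_u^{(b_s)} = \frac{\phi^{(b_s)} l_u^{(b_s)} p_u^{(b_s)} |\mathbf{h}_u^{(b_s)\dagger} \mathbf{f}_u^{(b_s)}|^2}{\phi^{(b_s)} \sum_{j=1, j \neq u} l_j^{(b_s)} p_j^{(b_s)} |\mathbf{h}_u^{(b_s)\dagger} \mathbf{f}_j^{(b_s)}|^2 + \sum_{s' \neq s} \phi^{(b_{s'})} P^{(b_{s'})} |\mathbf{h}_u^{(b_{s'})\dagger}|^2 + \sigma^2}. \tag{37}
\]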
3.3.2 Joint load balancing and interference mitigation
Let us consider a joint optimization of load balancing $\mathbf{l}$, operation mode $\boldsymbol{\phi}$, interference mitigation $\mathbf{U}$, and transmit power allocation $\mathbf{p} = (p_1^{(b_0)}, p_2^{(b_0)}, \ldots, p_K^{(b_0)})$ that satisfies the transmit power budget of the MBS, i.e., $\operatorname{Tr}(\mathbf{P}\mathbf{T}^{\dagger}\mathbf{T}) \le P^{(b_0)}$. We define $\Bbbk_k^{(b_s)} = \frac{P^{(b_s)} |\mathbf{h}_k^{(b_s)\dagger}|^2}{\sigma^2}$ and $\Bbbk_o$ as the FD interference-to-noise ratio (INR) from an FD-enabled SC $b_s$ to any scheduled receiver $k$ and the allowed FD INR threshold, respectively. The FD interference threshold is defined such that $\sum_{k=1}^{K} \sum_{s=1}^{S} \Bbbk_k^{(b_s)} \le \Bbbk_o$, so that the total FD interference is considered as noise. Under the operation mode policy, we schedule the receiver $k$ and enable the transmission of SC $b_s$ as long as $\sum_{k=1}^{K} \sum_{s=1}^{S} l_k^{(b_0)} \phi^{(b_s)} \Bbbk_k^{(b_s)} \le \Bbbk_o$. Let $\boldsymbol{\Lambda}_o = \{\mathbf{l}, \boldsymbol{\phi}\}$ be a composite control variable of user association and operation mode. We define $\boldsymbol{\Lambda} = \{\boldsymbol{\Lambda}_o, \mathbf{U}, \mathbf{p}\}$ as a composite control variable, which adapts to the spatial channel correlation matrix $\boldsymbol{\Theta}$.
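For a given $\boldsymbol{\Theta}$ that satisfies (31) and the operation mode policy, the respective ergodic data rates of SC $s$ and SUE $u$ are $r_{s+M}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) = \mathbb{E}\big[\log(1 + \gamma_{s+M}^{(b_0)})\big]$ and $r_u^{(b_s)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) = \mathbb{E}\big[\log(1 + \gamma_u^{(b_s)})\big]$. From the constraint (30), the ergodic data rate of MUE $m$ depends on which BS the MUE is associated with, i.e.,
\[
r_m(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) = \mathbb{E}\big[\log(1 + \gamma_m^{(b_0)})\big] + \sum_{s=1}^{S} \min\Big\{ \mathbb{E}\big[\log(1 + \gamma_m^{(b_s)})\big],\ r_{s+M}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) - \sum_{u \neq m} r_u^{(b_s)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) \Big\}.
\]
In other words, the first term is the data rate from the MBS to the MUE when the MUE is associated with the MBS, while the second term applies when the FD-enabled SCs allow the MUE to connect. (If the MUE is connected to the FD-enabled SC, then the rate of the MUE is the minimum between $r_m^{(b_s)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta})$ and the data stream from the MBS via the FD-enabled SC to the MUE, excluding the streams destined for the SC's other users.)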
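For a given composite control variable $\boldsymbol{\Lambda}$ that adapts to the spatial channel correlation matrix $\boldsymbol{\Theta}$, the average data rate region is defined as the convex hull of the average data rates of users, which is expressed as:
\[
\mathcal{R} \triangleq \left\{ \mathbf{r}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) \in \mathbb{R}_{+}^{K} \;\middle|\;
\begin{aligned}
& \mathbf{l} \in \{0,1\}^{K+MS+S},\ \boldsymbol{\phi} \in \{0,1\}^{S}, \\
& \textstyle\sum_{s=0}^{S} l_m^{(b_s)} \le 1,\ \forall m \in \mathcal{M}, \\
& \textstyle\sum_{m=1}^{M} l_m^{(b_s)} + l_{c_s}^{(b_s)} = N_s^{\mathrm{tx}},\ N_s^{\mathrm{tx}} \le N_s^{\mathrm{au}},\ \forall b_s \in \mathcal{S}, \\
& \textstyle\sum_{k=1}^{K} l_k^{(b_0)} + \sum_{s=1}^{S} N_s^{\mathrm{tx}} \le N, \\
& \textstyle\sum_{k=1}^{K} \sum_{s=1}^{S} l_k^{(b_0)} \phi^{(b_s)} \Bbbk_k^{(b_s)} \le \Bbbk_o, \\
& \operatorname{Tr}(\mathbf{P}\mathbf{T}^{\dagger}\mathbf{T}) \le P^{(b_0)},\ \mathbf{U}^{\dagger} \textstyle\sum_{s=1}^{S} \phi^{(b_s)} \boldsymbol{\Theta}_s^{(b_0)} = \mathbf{0}
\end{aligned}
\right\},
\]
where $\mathbf{r}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) = (r_1(\boldsymbol{\Lambda}|\boldsymbol{\Theta}), \ldots, r_K(\boldsymbol{\Lambda}|\boldsymbol{\Theta}))^T$. Following the results from [83], the boundary points of the rate region with a total power constraint and no self-interference are Pareto-optimal³. Moreover, according to [84, Proposition 1], if the INR covariance matrices approach the identity matrix, the Pareto rate region of the MIMO interference system is convex. Hence, our rate region is Pareto-optimal and thus convex under the above constraints.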
Let us assume that each FD-enabled SC acts as a relay to forward data to its users. If the MBS transmits data to an FD-enabled SC $b_s$, but the transmission of SC $b_s$ is disabled, the SC cannot serve its SUE. Hence, we define $\mathbf{D}(t) = (D_1(t), D_2(t), \ldots, D_S(t))$ as the data queues at the SCs, where at each time slot $t$, the wireless backhaul queue at FD-enabled SC $b_s$ evolves as
\[
D_s(t+1) = \max\big[ D_s(t) + r_{s+M}(t) - r_{c_s}^{(b_s)}(t),\ 0 \big], \quad \forall s \in \mathcal{S}. \tag{38}
\]
The SC offloads some MUEs from the MBS if the wireless backhaul capacity between the SCs and the MBS is guaranteed, and hence, for each SC we have the following wireless backhaul condition for all $t \ge 0$: "If the access link between the MUE $m$ and the MBS is better than the link between the MUE $m$ and the SCs, then the MUE connects with the MBS rather than with other SCs", i.e.,⁴
\[
\text{if } r_{s+M}(t) \le r_m^{(b_0)}(t), \text{ then } l_m^{(b_s)} = 0, \quad \forall s \in \mathcal{S},\ m \in \mathcal{M}. \tag{39}
\]

3 The Pareto-optimal set is the set of user rates at which it is impossible to improve any of the rates without simultaneously decreasing at least one of the others.
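To make the backhaul dynamics concrete, the sketch below steps the queue recursion (38) on a toy rate trace and encodes condition (39) as a necessary check for offloading. The function names and the numbers (arbitrary rate units) are illustrative assumptions.

```python
def backhaul_queue_update(d_s, r_backhaul, r_access):
    """One step of (38): D_s(t+1) = max[D_s(t) + r_{s+M}(t) - r_{c_s}^{(b_s)}(t), 0]."""
    return max(d_s + r_backhaul - r_access, 0.0)

def may_offload(r_backhaul, r_mbs_to_mue):
    """Condition (39): if r_{s+M}(t) <= r_m^{(b_0)}(t), the MUE must stay with the MBS."""
    return r_backhaul > r_mbs_to_mue

# Toy trace of (backhaul rate into the SC, access rate out of the SC) per slot.
queue = 0.0
history = []
for r_bh, r_ac in [(3.0, 1.0), (3.0, 1.0), (0.0, 5.0)]:
    queue = backhaul_queue_update(queue, r_bh, r_ac)
    history.append(queue)
# The queue builds up while the backhaul outpaces the access link, then drains to zero.
```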
We define the network utility function $f_0(\cdot)$ to be non-decreasing and concave over the convex region $\mathcal{R}$ for a given $\boldsymbol{\Theta}$. The objective is to maximize the network utility under wireless backhaul constraints and imperfect CSI. Thus, the NUM problem is given by
\[
\begin{aligned}
\textbf{OP1:} \quad \max_{\bar{\mathbf{r}}} \quad & f_0(\bar{\mathbf{r}}) && \text{(40a)} \\
\text{subject to} \quad & (39),\ \bar{\mathbf{r}} \in \mathcal{R},\ \bar{\mathbf{D}} < \infty, && \text{(40b)}
\end{aligned}
\]
where the overbar denotes the long-term time average, $f_0(\bar{\mathbf{r}}) = \sum_{k=1}^{K} \omega_k(t) f(\bar{r}_k)$ with $\omega_k(t) \ge 0$ being the weight of user $k$, and $f(\cdot)$ is assumed to be a twice differentiable, concave, and increasing $L$-Lipschitz function for all $\bar{r} \ge 0$. Solving (40) is non-trivial since the average rate region $\mathcal{R}$ does not have a tractable form. To overcome this challenge, we need to find closed-form expressions of the data rate and the average transmit power. Inspired by [41], we invoke RMT to get the closed-form expressions for the user data rate and transmit power as $N \gg K$.
3.3.3 Closed-form expression via a deterministic equivalent
We invoke recent results from RMT to get the deterministic equivalent of the user rate
and transmit power via Theorem 3.1.
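Theorem 3.1. Recall that $\zeta$ is the RZF parameter. As $N \gg K$ with $N, K \to \infty$, by applying the technique in [41, Theorem 2], the deterministic equivalent of the asymptotic SINR of MUE $m$ is
\[
\gamma_m^{(b_0)} \xrightarrow{\text{a.s.}} \frac{l_m^{(b_0)} p_m^{(b_0)} (1 - \tau_m^2)(\Omega_m)^2}{\Phi},
\]
where $\xrightarrow{\text{a.s.}}$ denotes almost sure convergence and
\[
\Phi = \Upsilon_m \big[ \zeta^2 - \tau_m^2 \big( \zeta^2 - (\zeta + \Omega_m)^2 \big) \big] + (\zeta + \Omega_m)^2 \Big( \sigma^2 + \sum_{s=1}^{S} \phi^{(b_s)} \Bbbk_m^{(b_s)} \Big).
\]
Here, $\Omega_m = \frac{1}{N} \operatorname{Tr}(\boldsymbol{\Theta}_m \mathbf{G})$ is the unique positive solution of this fixed-point equation and is the Stieltjes transform of a nonnegative finite measure [41, Theorem 1], where
\[
\mathbf{G} = \Big( \frac{1}{N} \sum_{k=1}^{K} \frac{\boldsymbol{\Theta}_k}{\zeta + \Omega_k} + \mathbf{I}_{N_{\mathrm{itf}}} \Big)^{-1}.
\]
In addition,
\[
\Upsilon_m = \frac{1}{N} \sum_{k=1, k \neq m}^{K} \frac{\zeta^2\, l_k^{(b_0)} p_k^{(b_0)} e_{mk}}{(\zeta + \Omega_k)^2},
\]
and $\boldsymbol{\Theta}_k = \mathbf{U}\mathbf{U}^{\dagger} \boldsymbol{\Theta}_k^{(b_0)} \mathbf{U}\mathbf{U}^{\dagger}$. The vectors $\mathbf{e} = [e_k]$, $k \in \mathcal{K}$, and $\mathbf{e}_m = [e_{mk}]$, $k \in \mathcal{K}$, are given by $\mathbf{e} = (\mathbf{I} - \mathbf{J})^{-1}\mathbf{u}$ and $\mathbf{e}_m = (\mathbf{I} - \mathbf{J})^{-1}\mathbf{u}_m$, where $\mathbf{J} = [J_{ij}]$, $i, j \in \mathcal{K}$, $\mathbf{u} = [u_k]$, $k \in \mathcal{K}$, and $\mathbf{u}_m = [u_{mk}]$, $k \in \mathcal{K}$, are given by
\[
J_{ij} = \frac{\frac{1}{N} \operatorname{tr}\, \boldsymbol{\Theta}_i \mathbf{G} \boldsymbol{\Theta}_j \mathbf{G}}{N (\zeta + \Omega_j)^2}, \qquad
u_{mk} = \frac{1}{\zeta^2 N} \operatorname{tr}\, \boldsymbol{\Theta}_k \mathbf{G} \boldsymbol{\Theta}_m \mathbf{G}, \qquad
u_k = \frac{1}{\zeta^2 N} \operatorname{tr}\, \boldsymbol{\Theta}_k \mathbf{G}^2.
\]
Similarly, the SINR of SC $b_s$ is
\[
\gamma_s^{(b_0)} \xrightarrow{\text{a.s.}} \frac{l_s^{(b_0)} p_s^{(b_0)} (\Omega_s)^2}{\zeta^2 \Upsilon_s + (\zeta + \Omega_s)^2 \Big( \sigma^2 + \sum_{s'=1, s' \neq s}^{S} \phi^{(b_{s'})} \Bbbk_s^{(b_{s'})} \Big)}.
\]
The power constraint at the MBS can be calculated as $\frac{1}{N} \sum_{k=1}^{K} \frac{p_k^{(b_0)} \zeta^2 e_k}{(\zeta + \Omega_k)^2} - P^{(b_0)} \le 0$.

4 The queues of MUEs are handled at the MBS and the SCs strictly handle data for SUEs. Hence, when the SCs open a connection for the MUEs, they should have immediate capacity in terms of data rate. We do not include the constraint (39) for the closed access case in [74].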
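Moreover, following the analysis in the proof of [41, Theorem 3] and [25, Lemma 6], for a small fixed $\zeta > 0$, $\Upsilon_k = O(1)$ and $\zeta^2 e_k = \Omega_k + O(\zeta)$ yield the deterministic equivalent of the asymptotic SINRs of UEs (35)-(37) as
\[
\gamma_m^{(b_0)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) \xrightarrow{\text{a.s.}} \frac{l_m^{(b_0)} p_m^{(b_0)} (1 - \tau_m^2)}{\sigma^2 + \sum_{s=1}^{S} \phi^{(b_s)} \Bbbk_m^{(b_s)}}, \tag{41}
\]
\[
\gamma_s^{(b_0)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) \xrightarrow{\text{a.s.}} \frac{l_s^{(b_0)} p_s^{(b_0)}}{\sigma^2 + \sum_{s'=1, s' \neq s}^{S} \phi^{(b_{s'})} \Bbbk_s^{(b_{s'})}}, \tag{42}
\]
\[
\gamma_u^{(b_s)}(\boldsymbol{\Lambda}|\boldsymbol{\Theta}) \xrightarrow{\text{a.s.}} \frac{\phi^{(b_s)} l_u^{(b_s)} p_u^{(b_s)}}{\sigma^2 + \sum_{s'=1, s' \neq s}^{S} \phi^{(b_{s'})} \Bbbk_u^{(b_{s'})}}. \tag{43}
\]
Moreover, we obtain the closed-form expression for the transmit power constraint, i.e., $\frac{1}{N} \sum_{k=1}^{K} \frac{p_k^{(b_0)}}{\Omega_k} - P^{(b_0)} \le 0$.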
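The closed forms are simple enough to evaluate directly. The sketch below computes the deterministic-equivalent SINR of an MUE as in (41) and checks the closed-form MBS power constraint; the argument names (`inr` for the FD INR terms, `omega` for the Ω values) and the numbers are hypothetical.

```python
def sinr_mue(l, p, tau, sigma2, phi, inr):
    """(41): l * p * (1 - tau^2) / (sigma^2 + sum_s phi_s * inr_s)."""
    fd_interference = sum(ph * k for ph, k in zip(phi, inr))
    return l * p * (1.0 - tau ** 2) / (sigma2 + fd_interference)

def mbs_power_ok(p, omega, n_antennas, p_max):
    """Closed-form power constraint: (1/N) * sum_k p_k / Omega_k <= P^(b0)."""
    used = sum(pk / ok for pk, ok in zip(p, omega)) / n_antennas
    return used <= p_max, used

# Two SCs; only the first one is in FD mode, so only its INR term contributes.
gamma_perfect = sinr_mue(l=1, p=2.0, tau=0.0, sigma2=1.0, phi=[1, 0], inr=[1.0, 5.0])
gamma_imperfect = sinr_mue(l=1, p=2.0, tau=0.5, sigma2=1.0, phi=[1, 0], inr=[1.0, 5.0])
feasible, used_power = mbs_power_ok(p=[2.0, 4.0], omega=[1.0, 2.0], n_antennas=4, p_max=1.0)
```

Switching an SC to HD mode (its entry of `phi` set to 0) removes its term from the denominator, which is the lever that the operation mode policy uses.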
Although the closed-form expressions of the average data rate and transmit power
are obtained, it is still challenging to solve our predefined problem (40), since (40) con-
siders an optimization of a function of the time-average with a large number of control
variables, and a dynamic traffic load over the convex region for a given composite con-
trol variable ΛΛΛ and ΘΘΘ. The Lyapunov stochastic optimization is a powerful tool to (i)
transform a problem of a function of the time average into a problem of time average
of a function, and (ii) decouple complex problems into several simple sub-problems.
Moreover, our aim is to maximize the aggregate network utility subject to queue stability, for which the Lyapunov stochastic optimization yields utility and throughput optimality as well as stability [77]. Hence, we apply the drift-plus-penalty technique [77] to find the
solutions for load balancing, operation mode selection, and power allocation problems.
3.4 Proposed load balancing and interference mitigation
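We assume that the network system is modelled as a queueing network that operates in discrete time $t \in \{0, 1, 2, \ldots\}$. Let $a_k(t)$ denote the bursty data arrival destined for each user $k$, i.i.d. over time slot $t$. Let $\mathbf{Q}(t)$ denote the vector of transmission queue backlogs at the MBS at slot $t$. The queue evolution is given by
\[
Q_k(t+1) = \max[Q_k(t) - r_k(t),\ 0] + a_k(t), \quad \forall k \in \mathcal{K}. \tag{44}
\]
Here, we assume that the traffic arrival of user $k$ is bounded so that $0 \le a_k(t) \le a_k^{\max}$, for some constant $a_k^{\max} < \infty$. Furthermore, let $r_k^{\max}(t)$ be the upper bound of the data rate for user $k$ at time slot $t$, such that $r_k^{\max}(t) \le a_k^{\max}$. The set in constraint (40b) is replaced by an equivalent set by introducing the auxiliary variables $\boldsymbol{\varphi}(t) \in \mathcal{R}$, $\boldsymbol{\varphi}(t) = (\varphi_1(t), \ldots, \varphi_K(t))$, that satisfy $\bar{\varphi}_k \le \bar{r}_k$, where $\bar{\varphi}_k \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}[\varphi_k(\tau)]$. The evolution of the wireless backhaul queue is rewritten as
\[
D_s(t+1) = \max\big[ D_s(t) + \varphi_{s+M}(t) - r_{c_s}^{(b_s)}(t),\ 0 \big], \quad \forall s \in \mathcal{S}. \tag{45}
\]
For a given $\boldsymbol{\Lambda}$ and $\boldsymbol{\Theta}$, the optimization problem (40) subject to the network stability and dynamic backhaul can be posed as
\[
\begin{aligned}
\textbf{RP1:} \quad \min_{\boldsymbol{\varphi}} \quad & -f_0(\bar{\boldsymbol{\varphi}}) && \text{(46a)} \\
\text{subject to} \quad & \bar{\varphi}_k - \bar{r}_k \le 0, \quad \forall k \in \mathcal{K}, && \text{(46b)} \\
& (39),\ \bar{\mathbf{D}} < \infty,\ \bar{\mathbf{Q}} < \infty. && \text{(46c)}
\end{aligned}
\]
In order to ensure the inequality constraint (46b), we introduce a virtual queue vector $\mathbf{Y}(t)$, which evolves as follows:
\[
Y_k(t+1) = \max[Y_k(t) + \varphi_k(t) - r_k(t),\ 0], \quad \forall k \in \mathcal{K}. \tag{47}
\]
We define the queue backlog vector as $\boldsymbol{\Xi}(t) = [\mathbf{Q}(t), \mathbf{Y}(t), \mathbf{D}(t)]$; the stability of $\boldsymbol{\Xi}(t)$ ensures that all constraints of problem (46) hold. The Lyapunov function can be written as
\[
L(\boldsymbol{\Xi}(t)) \triangleq \frac{1}{2} \Big[ \sum_{k=1}^{K} Q_k(t)^2 + \sum_{k=1}^{K} Y_k(t)^2 + \sum_{s=1}^{S} D_s(t)^2 \Big].
\]
For each time slot $t$, $\Delta(\boldsymbol{\Xi}(t))$ denotes the Lyapunov drift, which is given by
\[
\Delta(\boldsymbol{\Xi}(t)) \triangleq \mathbb{E}\big[ L(\boldsymbol{\Xi}(t+1)) - L(\boldsymbol{\Xi}(t)) \mid \boldsymbol{\Xi}(t) \big].
\]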
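\[
\Big[ \Big[ \overbrace{-\sum_{k} \big( Q_k(t) + Y_k(t) \big) r_k(\boldsymbol{\Lambda}(t))}^{\text{Impact of network queue, virtual queue, and } \boldsymbol{\Lambda}} \Big]_{1\star}
\overbrace{-\sum_{s} D_s(t)\, r_{c_s}^{(b_s)}(\phi^{(b_s)}(t))}^{\text{Impact of SC queue and } \boldsymbol{\phi}} \Big]_{2\star}
+ \Big[ \overbrace{\sum_{k} Y_k(t) \varphi_k(t) + \sum_{s} D_s(t) \varphi_{s+M}(t)}^{\text{Impact of virtual queue, SC queue, and auxiliaries}}
\overbrace{- \nu f_0(\boldsymbol{\varphi}(t))}^{\text{penalty}} \Big]_{3\star}. \tag{49}
\]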
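Noting that $\max[a, 0]^2 \le a^2$ and $(a \pm b)^2 \le a^2 \pm 2ab + b^2$ for any positive real numbers $a$ and $b$, and neglecting the index $t$, we have:
\[
\begin{aligned}
(\max[Q_k - r_k,\ 0] + a_k)^2 - Q_k^2 &\le 2 Q_k (a_k - r_k) + (a_k - r_k)^2, \\
\max[Y_k + \varphi_k - r_k,\ 0]^2 - Y_k^2 &\le 2 Y_k (\varphi_k - r_k) + (\varphi_k - r_k)^2, \\
\max\big[D_s + \varphi_{s+M} - r_{c_s}^{(b_s)},\ 0\big]^2 - D_s^2 &\le 2 D_s \big(\varphi_{s+M} - r_{c_s}^{(b_s)}\big) + \big(\varphi_{s+M} - r_{c_s}^{(b_s)}\big)^2.
\end{aligned}
\]
We assume that $\boldsymbol{\varphi}(t) \in \mathcal{R}$ and that $\mathbf{l}$ is feasible for all $t$ and all possible $\boldsymbol{\Xi}(t)$; thus we have
\[
\begin{aligned}
\Delta(\boldsymbol{\Xi}(t)) \le{}& \Psi + \sum_{k=1}^{K} Q_k(t)\, \mathbb{E}\big[ a_k(t) - r_k(t) \mid \boldsymbol{\Xi}(t) \big]
+ \sum_{s=1}^{S} D_s(t)\, \mathbb{E}\big[ \varphi_{s+M}(t) - r_{c_s}^{(b_s)}(t) \mid \boldsymbol{\Xi}(t) \big] \\
&+ \sum_{k=1}^{K} Y_k(t)\, \mathbb{E}\big[ \varphi_k(t) - r_k(t) \mid \boldsymbol{\Xi}(t) \big].
\end{aligned} \tag{48}
\]
Here $\Delta(\boldsymbol{\Xi}(t)) \le \Pi$, where $\Pi$ represents the R.H.S. of (48), and $\Psi$ is a finite constant that satisfies
\[
\Psi \ge \frac{1}{2} \sum_{k=1}^{K} \mathbb{E}\big[ (a_k(t) - r_k(t))^2 \mid \boldsymbol{\Xi}(t) \big]
+ \frac{1}{2} \sum_{k=1}^{K} \mathbb{E}\big[ (\varphi_k(t) - r_k(t))^2 \mid \boldsymbol{\Xi}(t) \big]
+ \frac{1}{2} \sum_{s=1}^{S} \mathbb{E}\big[ (\varphi_{s+M}(t) - r_{c_s}^{(b_s)}(t))^2 \mid \boldsymbol{\Xi}(t) \big]
\]
for all $t$ and all possible $\boldsymbol{\Xi}(t)$. We apply the Lyapunov drift-plus-penalty technique [77], where the solution of (46) is obtained by minimizing the Lyapunov drift and a penalty from the objective function, i.e.,
\[
\min\ \Pi - \nu\, \mathbb{E}[ f_0(\boldsymbol{\varphi}(t)) ].
\]
Here, the parameter $\nu$ is chosen as a non-negative constant to control the optimal minimization solution [77]. Since $\Psi$ is finite, the problem is to minimize the expression in (49) over the convex hull of the rate region. Note that (49) is decoupled over the user association, user scheduling, and operation mode variables ($2\star$), the auxiliary variables ($3\star$), and the precoder and power allocation variables ($1\star$). Hence, the respective variables can be found independently by minimizing the individual terms at each time slot. Fig. 5 summarizes the relationship among the various subproblems.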
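The queue recursions (44), (45), and (47) and the quadratic Lyapunov function can be sketched in a few lines. The code below is an illustrative toy with hypothetical names and a two-user instance (one MUE, one SC); it only shows the bookkeeping performed at each slot, not the optimization that picks the rates and auxiliary variables.

```python
def step_queues(q, y, d, a, r, phi, r_access, m):
    """
    One slot of (44), (47), and (45).
    q, y, a, r, phi: lists over the K = M + S users served by the MBS.
    d, r_access:     lists over the S small cells.
    m:               number of MUEs, so user index m + s is SC s (0-based).
    """
    q_next = [max(qk - rk, 0.0) + ak for qk, rk, ak in zip(q, r, a)]
    y_next = [max(yk + pk - rk, 0.0) for yk, pk, rk in zip(y, phi, r)]
    d_next = [max(ds + phi[m + s] - r_access[s], 0.0) for s, ds in enumerate(d)]
    return q_next, y_next, d_next

def lyapunov(q, y, d):
    """L(Xi) = 0.5 * (sum_k Q_k^2 + sum_k Y_k^2 + sum_s D_s^2)."""
    return 0.5 * (sum(v * v for v in q) + sum(v * v for v in y) + sum(v * v for v in d))

# Toy slot with M = 1 MUE and S = 1 SC (so K = 2).
q, y, d = [2.0, 0.0], [0.0, 1.0], [1.0]
q1, y1, d1 = step_queues(q, y, d, a=[1.0, 1.0], r=[1.0, 2.0],
                         phi=[2.0, 1.0], r_access=[3.0], m=1)
one_slot_drift = lyapunov(q1, y1, d1) - lyapunov(q, y, d)  # sample of the drift for this slot
```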
[Figure: flow diagram over time indices. Blocks: Load Balancing & Operation Mode (l(t), φ(t); Algorithm 1, with p_k(t)^(b0) = P^(b0)/K), Beamforming Design (V = UT), Power Allocation (p(t)^(b0)), Auxiliary Variable Selection (ϕ(t)), and Queue Update (Q(t), D(t), Y(t) to Q(t+1), D(t+1), Y(t+1)). Signalling: CSI report and DL transmission among the MBS, SCs, SUEs, and MUEs.]
Fig. 5. Joint load balancing and interference mitigation algorithm ([23] ©2017 IEEE).
3.4.1 Joint load balancing and operation mode selection
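First, the problem of joint load balancing and FD-enabled SC operation mode selection in ($2\star$) is cast as the minimization problem below.
\[
\begin{aligned}
\min_{\mathbf{l}, \boldsymbol{\phi}} \quad & -\sum_{k=1}^{K} A_k(t) \log\Big( 1 + \frac{l_k^{(b_0)}(t)\, p_k^{(b_0)} (1 - \tau_k^2)}{\sigma^2 + \sum_{s=1}^{S} \phi^{(b_s)} \Bbbk_k^{(b_s)}} \Big)
- \sum_{s=1}^{S} D_s(t) \log\Big( 1 + \frac{\phi^{(b_s)}(t)\, l_{c_s}^{(b_s)}(t)\, p_{c_s}^{(b_s)}}{\sigma^2 + \sum_{s' \neq s}^{S} \phi^{(b_{s'})} \Bbbk_{c_s}^{(b_{s'})}} \Big) && \text{(50a)} \\
\text{subject to} \quad & l_j^{(b_s)}(t) \in \{0, 1\}, \quad \forall j \in \mathcal{K} \cup \mathcal{C},\ \forall b_s \in \mathcal{B}, && \text{(50b)} \\
& \phi^{(b_s)}(t) \in \{0, 1\},\ N_s^{\mathrm{tx}}(t) \le N_s^{\mathrm{au}}, \quad \forall s \in \mathcal{S}, && \text{(50c)} \\
& \textstyle\sum_{s=0}^{S} l_m^{(b_s)}(t) \le 1,\ \forall m \in \mathcal{M}, \quad \sum_{m=1}^{M} l_m^{(b_s)}(t) + l_{c_s}^{(b_s)}(t) = N_s^{\mathrm{tx}}(t), && \text{(50d)} \\
& (39),\ r_k(t) \in \mathcal{R}, \quad \textstyle\sum_{k=1}^{K} l_k^{(b_0)}(t) + \sum_{s=1}^{S} N_s^{\mathrm{tx}}(t) \le N, && \text{(50e)} \\
& \textstyle\sum_{k=1}^{K} \sum_{s=1}^{S} l_k^{(b_0)}(t)\, \phi^{(b_s)}(t)\, \Bbbk_k^{(b_s)}(t) \le \Bbbk_o, && \text{(50f)}
\end{aligned}
\]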
where $A_k(t) = Q_k(t) + Y_k(t)$. This problem is a non-convex program with binary variables. It turns out that this problem has a hidden convexity structure, and the non-convex terms can be iteratively approximated by their convex upper bounds via an iterative SCA method. The motivations for utilizing the SCA method are (i) its low complexity and fast convergence [85, Lemma 3.5] and (ii) the fact that the obtained solution yields many relaxed variables that are close to zero or one [86]. In this regard, we convexify this problem to find a sub-optimal solution. First, we relax the binary constraints (50b) and (50c) to linear constraints on continuous variables. Secondly, at each iteration $i$ the non-convex constraint (50f) is approximated by an upper convex approximation, i.e.,
\[
\sum_{k=1}^{K} \sum_{s=1}^{S} \Big( \frac{\delta_{ks}^{(i)} \big(l_k^{(b_0)}(t)\big)^2}{2} + \frac{\big(\phi^{(b_s)}(t)\big)^2}{2 \delta_{ks}^{(i)}} \Big) \Bbbk_k^{(b_s)}(t) - \Bbbk_o \le 0,
\]
for every fixed positive value $\delta_{ks}^{(i)}$. Finally, instead of minimizing the non-convex objective function (50a), we convert it into a convex function as follows. We minimize its upper bound by replacing the denominators, i.e., $\sigma^2 + \sum_{s=1}^{S} \phi^{(b_s)} \Bbbk_m^{(b_s)}$, with the largest bound, i.e., $\sigma^2 + \Bbbk_o$. Due to the interference constraint (50f), we obtain the upper bound
\[
-\sum_{k=1}^{K} A_k(t) \log\Big( 1 + \frac{l_k^{(b_0)}(t)\, p_k^{(b_0)} (1 - \tau_k^2)}{\sigma^2 + \Bbbk_o} \Big)
- \sum_{s=1}^{S} D_s(t) \log\Big( 1 + \phi^{(b_s)}(t) \frac{l_{c_s}^{(b_s)}(t)\, p_{c_s}^{(b_s)}}{\sigma^2 + \Bbbk_o} \Big).
\]
Using a similar approach to convexifying the interference constraint (50f), we convexify the second part of the objective function, which still remains non-convex. We denote the lower bound of the SINR of the UE served by SC $b_s$ as $\gamma_{b_s}(t)$. Let us set $\tilde{l}_{c_s}^{(b_s)}(t) \triangleq \frac{l_{c_s}^{(b_s)}(t)\, p_{c_s}^{(b_s)}}{\sigma^2 + \Bbbk_o}$. Then we have:
\[
\gamma_{b_s}(t) \le \phi^{(b_s)}(t)\, \tilde{l}_{c_s}^{(b_s)}(t), \quad \forall s \in \mathcal{S}. \tag{51}
\]
By introducing the new slack variable $\rho_s^2(t)$, (51) is equivalent to:
\[
\frac{1}{4} \big( \phi^{(b_s)}(t) - \tilde{l}_{c_s}^{(b_s)}(t) \big)^2 + \rho_s^2(t) \le \frac{1}{4} \big( \phi^{(b_s)}(t) + \tilde{l}_{c_s}^{(b_s)}(t) \big)^2, \tag{52}
\]
\[
\text{and} \quad \gamma_{b_s}(t) \le \rho_s^2(t), \quad \forall s \in \mathcal{S}, \tag{53}
\]
where the constraint (52) has the form of a second-order cone (SOC) inequality, while the RHS of the set of constraints in (53) is still non-convex and can be approximated by using the iterative SCA method [85]. We rewrite the constraint (53) as
\[
\gamma_{b_s}(t) \le \rho_s^{(i)2}(t) + 2 \rho_s^{(i)}(t) \big( \rho_s(t) - \rho_s^{(i)}(t) \big), \quad \forall s \in \mathcal{S}, \tag{54}
\]
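Algorithm 3.1 Joint load balancing and operation mode algorithm ([23] ©2017 IEEE)
1: Initialization: $i := 0$; $\delta_{ks}^{(i)}$, $\rho_s^{(i)}$ := random positive values that satisfy all constraints.
2: repeat
3:   Solve (55) with $\delta_{ks}^{(i)}$, $\rho_s^{(i)}$ to get the optimal value $\boldsymbol{\Lambda}_o^{\star} = \{\mathbf{l}^{\star}, \boldsymbol{\phi}^{\star}\}$.
4:   Update $\boldsymbol{\Lambda}_o^{(i)} := \boldsymbol{\Lambda}_o^{\star}$ and $\delta_{ks}^{(i+1)} := \phi^{(b_s)(i)} / l_k^{(b_0)(i)}$; $\rho_s^{(i+1)} := \rho_s^{(i)}$; $i := i + 1$.
5: until Convergence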
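where at iteration $i+1$, we update $\rho_s^{(i+1)}(t)$ such that $\rho_s^{(i+1)}(t) = \rho_s^{(i)}(t)$. Hence, the optimal value of $\boldsymbol{\Lambda}_o$ is given by
\[
\begin{aligned}
\min_{\mathbf{l}, \boldsymbol{\phi}} \quad & -\sum_{k=1}^{K} A_k(t) \log\Big( 1 + \frac{l_k^{(b_0)}(t)\, p_k^{(b_0)} (1 - \tau_k^2)}{\sigma^2 + \Bbbk_o} \Big) - \sum_{s=1}^{S} D_s(t) \log\big( 1 + \gamma_{b_s}(t) \big) && \text{(55a)} \\
\text{subject to} \quad & l_j^{(b_s)}(t) \in [0, 1], \quad \forall j \in \mathcal{K} \cup \mathcal{C},\ \forall b_s \in \mathcal{B}, && \text{(55b)} \\
& \phi^{(b_s)}(t) \in [0, 1],\ N_s^{\mathrm{tx}}(t) \le N_s^{\mathrm{au}}, \quad \forall s \in \mathcal{S}, && \text{(55c)} \\
& \text{(50d)},\ \text{(50e)},\ \text{(52)},\ \text{(54)}, && \text{(55d)} \\
& \sum_{k=1}^{K} \sum_{s=1}^{S} \Big( \frac{\delta_{ks}^{(i)} \big(l_k^{(b_0)}(t)\big)^2}{2} + \frac{\big(\phi^{(b_s)}(t)\big)^2}{2 \delta_{ks}^{(i)}} \Big) \Bbbk_k^{(b_s)}(t) - \Bbbk_o \le 0. && \text{(55e)}
\end{aligned}
\]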
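The three convexification steps used in (52), (54), and (55e) rest on elementary inequalities that are easy to verify numerically. The sketch below is only a sanity check with hypothetical function names and arbitrary test values; it does not solve (55).

```python
def bilinear_upper_bound(l, phi, delta):
    """Convex surrogate of the product l * phi in (55e): delta*l^2/2 + phi^2/(2*delta)."""
    return delta * l * l / 2.0 + phi * phi / (2.0 * delta)

def square_lower_bound(rho, rho_i):
    """First-order surrogate of rho^2 around rho_i, i.e., the right-hand side of (54)."""
    return rho_i * rho_i + 2.0 * rho_i * (rho - rho_i)

def soc_form_holds(phi, l_tilde, rho):
    """Constraint (52); algebraically the same as rho^2 <= phi * l_tilde."""
    return 0.25 * (phi - l_tilde) ** 2 + rho ** 2 <= 0.25 * (phi + l_tilde) ** 2 + 1e-12

l, phi = 0.5, 0.8
loose = bilinear_upper_bound(l, phi, delta=1.0)       # any delta > 0 gives an upper bound
tight = bilinear_upper_bound(l, phi, delta=phi / l)   # the update in step 4 makes it tight
linearized = square_lower_bound(rho=0.7, rho_i=0.5)   # never exceeds 0.7^2
```

Choosing δ = φ/l at the previous iterate makes the surrogate touch the true product there, which is why the update in step 4 of Algorithm 3.1 takes that form.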
At each time slot $t$, the approximated problem (55) is iteratively solved as in Algorithm 3.1. We numerically observe that the SCA-based Algorithm 3.1 converges within a few iterations and yields a continuous relaxation solution in which many user association and operation mode variables are close or equal to binary values. To ensure that all users will be served, each user is assumed to receive the same transmit power when Algorithm 3.1 is run to find the best scheduled users. Moreover, the scheduling is performed over a long-term period, while the power allocation problem is executed over a short-term period. Since the objective function of the problem (55) is a maximum weighted matching problem with respect to a linear or square function, we use a low-complexity binary search algorithm [87] to obtain the final solutions with lower dimensions. Let $\mathcal{K}_1 = \{ j, s \mid l_j^{(b_s)\star}, \phi^{(b_s)\star} = 1 \}$, $\mathcal{K}_{\mathrm{uct}} = \{ j, s \mid \xi < l_j^{(b_s)\star}, \phi^{(b_s)\star} < 1 \}$, and $\mathcal{K}_0 = \{ j, s \mid l_j^{(b_s)\star}, \phi^{(b_s)\star} \le \xi \}$ denote the set of selected variables, the set of uncertain variables, and the set of removed variables, respectively, where $\xi$ is some small threshold. First, we determine the sets $\mathcal{K}_1$, $\mathcal{K}_{\mathrm{uct}}$, and $\mathcal{K}_0$ based on $\xi$. Then, we select among the uncertain variables in $\mathcal{K}_{\mathrm{uct}}$. After sorting $\mathcal{K}_{\mathrm{uct}}$ in descending order, a loop selects the variables one by one, starting from the largest weight in the objective function. We set the value of the uncertain variable to 1 and add it to $\mathcal{K}_1$ if it satisfies the antenna (55d) and interference (55e) constraints. If it does not satisfy the constraints, we add it to $\mathcal{K}_0$. The loop stops when it reaches the last uncertain variable or when the antenna budget is exhausted. Finally, $\mathcal{K}_1$ is kept, while $\mathcal{K}_0$ and $\mathcal{K}_{\mathrm{uct}}$ are removed.
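The rounding step described above can be sketched as a greedy pass over the uncertain variables. This is a simplified illustration, not the binary search of [87]: the variable names, the weights, and the `fits` feasibility callback (standing in for the antenna and interference checks) are all hypothetical.

```python
def round_relaxed(values, weights, xi, fits):
    """
    values:  dict name -> relaxed optimum in [0, 1] returned by the SCA iterations.
    weights: dict name -> weight of the variable in the objective (larger is better).
    xi:      small threshold separating removed from uncertain variables.
    fits:    callable(selected_set, candidate) -> bool; True if adding the candidate
             keeps the antenna and interference constraints satisfied.
    Returns the final set K1 of variables fixed to one.
    """
    k1 = {v for v, x in values.items() if x >= 1.0}
    uncertain = [v for v, x in values.items() if xi < x < 1.0]
    # Variables with value <= xi form K0 and are dropped outright.
    for cand in sorted(uncertain, key=lambda v: weights[v], reverse=True):
        if fits(k1, cand):
            k1.add(cand)  # promoted from K_uct to K1
        # otherwise the candidate is discarded (moved to K0)
    return k1

values = {"mue1": 1.0, "mue2": 0.6, "mue3": 0.4, "mue4": 0.01}
weights = {"mue1": 5.0, "mue2": 1.0, "mue3": 3.0, "mue4": 9.0}
# Toy feasibility rule: at most two scheduled users (a stand-in for the antenna budget).
selected = round_relaxed(values, weights, xi=0.05, fits=lambda sel, c: len(sel) < 2)
```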
3.4.2 Auxiliary variable optimization
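The optimal auxiliary variable from ($3\star$) is computed by
\[
\begin{aligned}
\min_{\boldsymbol{\varphi}(t)} \quad & \sum_{k=1}^{K} Y_k(t) \varphi_k(t) + \sum_{s=1}^{S} D_s(t) \varphi_{s+M}(t) && \text{(56a)} \\
& - \nu \sum_{k=1}^{K} \omega_k(t) f(\varphi_k(t)) && \text{(56b)} \\
\text{subject to} \quad & \varphi_k(t) \le a_k^{\max}(t). && \text{(56c)}
\end{aligned}
\]
Since the above optimization problem is convex, let $\varphi_k^{*}(t)$ be the optimal solution obtained by setting the first-order derivative of the objective function of (56) to zero. With a logarithmic utility function, we have:
\[
\varphi_k^{*}(t) =
\begin{cases}
\dfrac{\nu \omega_k(t)}{Y_k(t)} & \text{if } k \le M, \\[2ex]
\dfrac{\nu \omega_k(t)}{Y_k(t) + D_{k-M}(t)} & \text{otherwise.}
\end{cases}
\]
The optimal auxiliary variable is $\min\{ \varphi_k^{*}(t),\ a_k^{\max}(t) \}$.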
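The closed form above translates directly into code. The sketch below uses 0-based user indices (users 0..M-1 are MUEs, the rest are SCs); the function name and the handling of an empty backlog, where the cap is returned, are assumptions made for this illustration.

```python
def auxiliary_variable(k, m, nu, omega, y, d, a_max):
    """
    Minimiser of (56) for f = log, clipped at a_max[k].
    k is a 0-based user index; users 0..m-1 are MUEs and users m.. are SCs.
    """
    backlog = y[k] if k < m else y[k] + d[k - m]
    if backlog <= 0.0:
        # Assumed handling: with empty queues the objective decreases in phi, so take the cap.
        return a_max[k]
    return min(nu * omega[k] / backlog, a_max[k])

# Toy instance with M = 1 MUE and S = 1 SC.
nu, omega = 10.0, [1.0, 1.0]
y, d, a_max = [5.0, 2.0], [3.0], [4.0, 1.0]
phi_mue = auxiliary_variable(0, 1, nu, omega, y, d, a_max)  # 10 / 5, below its cap of 4
phi_sc = auxiliary_variable(1, 1, nu, omega, y, d, a_max)   # 10 / (2 + 3), clipped to 1
```

A larger ν pushes the auxiliary rates up towards their caps (favouring utility), whereas large virtual or backhaul queues pull them down (favouring stability).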
3.4.3 Interference mitigation and power allocation
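For given scheduled users, the precoder $\mathbf{U}$ is found by solving (31). Finally, problem (46) is decomposed to find the transmit power $p_k^{(b_0)}(t)$ from ($1\star$) by solving:
\[
\begin{aligned}
\min_{\mathbf{p}(t)} \quad & -\sum_{k=1}^{K} A_k(t)\, r_k(\mathbf{p}(t)) && \text{(57a)} \\
\text{subject to} \quad & \frac{1}{N} \sum_{k=1}^{K} \frac{p_k^{(b_0)}(t)}{\Omega_k(t)} - P^{(b_0)} \le 0, \\
& p_k^{(b_0)}(t) \ge 0, \quad \forall k \in \mathcal{K}.
\end{aligned}
\]
The objective function (57) is rewritten as $n(\mathbf{p}(t)) = -\sum_{k=1}^{K} A_k(t) \log\big( 1 + p_k^{(b_0)}(t)\, n_k(t) \big)$, where
\[
n_k(t) = \frac{l_k^{(b_0)}(t) (1 - \tau_k^2)}{\sigma^2 + \sum_{s=1}^{S} \phi^{(b_s)}(t)\, \Bbbk_k^{(b_s)}(t)}.
\]
The objective function is strictly convex for $p_k^{(b_0)}(t) \ge 0, \forall k \in \mathcal{K}$, and the feasible set is compact. Hence, the optimal solution $\mathbf{p}^{\star}(t)$ exists. The Lagrangian function is written as $\mathcal{L}(\mathbf{p}(t), \mu_0) = n(\mathbf{p}(t)) + \mu_0\, g(\mathbf{p}(t))$, where $g(\mathbf{p}(t)) = \frac{1}{N} \sum_{k=1}^{K} \frac{p_k^{(b_0)}(t)}{\Omega_k(t)} - P^{(b_0)}$ is the power constraint function and $\mu_0 \ge 0$ is the KKT multiplier. The KKT conditions are
\[
\nabla n(\mathbf{p}(t))^T + \mu_0 \frac{1}{N} \sum_{k=1}^{K} \frac{1}{\Omega_k(t)} = 0, \tag{58}
\]
\[
\mu_0 \Big( \frac{1}{N} \sum_{k=1}^{K} \frac{p_k^{(b_0)}(t)}{\Omega_k(t)} - P^{(b_0)} \Big) = 0, \tag{59}
\]
\[
\frac{1}{N} \sum_{k=1}^{K} \frac{p_k^{(b_0)}(t)}{\Omega_k(t)} - P^{(b_0)} \le 0, \quad -\mathbf{p}(t) \le 0, \quad \mu_0 \ge 0. \tag{60}
\]
Here, $\nabla n(\mathbf{p}(t))^T = \big( n'(p_1^{(b_0)}(t)), \ldots, n'(p_K^{(b_0)}(t)) \big)$, where $n'(p_k^{(b_0)}(t)) = \frac{-A_k(t)\, n_k(t)}{1 + p_k^{(b_0)}(t)\, n_k(t)}$. For $\mu_0 \neq 0$, from (58) we obtain
\[
p_k^{(b_0)}(t) = \max\Big[ \frac{A_k N \Omega_k(t)}{\mu_0} - \frac{1}{n_k(t)},\ 0 \Big], \tag{61}
\]
and from (59) and (61) we derive $\mu_0$. Finally, the optimal value $p_k^{(b_0)\star}(t)$ is obtained with (61).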
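One way to carry out the last step numerically is to search for the multiplier that makes the power budget tight and then apply (61). The sketch below uses a plain bisection on the multiplier; this search strategy, the function name, and the toy inputs are assumptions for illustration, since the text only states that the multiplier follows from (59) and (61).

```python
def power_allocation(a, n_gain, omega, n_antennas, p_max, iters=100):
    """
    Apply (61): p_k = max(A_k * N * Omega_k / mu0 - 1 / n_k, 0), with mu0 > 0 found by
    bisection so that (1/N) * sum_k p_k / Omega_k meets the budget p_max.
    Assumes at least one user has A_k > 0 and n_k > 0.
    """
    def powers(mu0):
        return [max(ak * n_antennas * ok / mu0 - 1.0 / nk, 0.0) if ak > 0 and nk > 0 else 0.0
                for ak, nk, ok in zip(a, n_gain, omega)]

    def used(mu0):
        return sum(pk / ok for pk, ok in zip(powers(mu0), omega)) / n_antennas

    lo, hi = 1e-12, 1.0
    while used(hi) > p_max:   # grow hi until the budget is no longer exceeded
        hi *= 2.0
    for _ in range(iters):    # used(mu0) is non-increasing in mu0
        mid = 0.5 * (lo + hi)
        if used(mid) > p_max:
            lo = mid
        else:
            hi = mid
    return powers(hi), hi

# Symmetric toy case: two identical users share the budget equally.
p_sym, mu_sym = power_allocation([1.0, 1.0], [1.0, 1.0], [1.0, 1.0], n_antennas=2, p_max=1.0)
# Asymmetric backlogs: the user with the larger A_k receives more power.
p_asym, mu_asym = power_allocation([2.0, 1.0], [1.0, 1.0], [1.0, 1.0], n_antennas=1, p_max=3.0)
```

The result has the familiar water-filling shape: users with larger queue weights or better effective gains receive more power, and users whose term in (61) is negative are switched off.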
3.4.4 Queue update
Update the virtual queues Yk(t) and Ds(t) according to (47) and (45), and the actual
queue Qk(t) in (44).
Theorem 3.2 is provided to analyse the performance of the network utility maximization based on the Lyapunov framework and to show that the queues are stable.
Theorem 3.2. [Optimality] Assume that all queues are initially empty. For arbitrary arrival rates, the operation mode and load balancing are chosen to satisfy (49) and the rate region. For a given constant $\chi \ge 0$, the network utility maximization with any $\nu > 0$ provides the following utility performance with a $\chi$-approximation:
\[
f_0 \ge f_0^{\star} - \frac{\Psi + \chi}{\nu},
\]
where $f_0^{\star}$ is the optimal network utility over the rate region. The strong stability of the virtual queues and the network queues is given by
\[
\begin{aligned}
Q_k(t) &\le \nu \omega_k(t) \pi_k + 2 a_k^{\max}, && \forall t \ge 0,\ \forall k \in \mathcal{K}, \\
Y_k(t) &\le \nu \omega_k(t) \pi_k + a_k^{\max}, && \forall t \ge 0,\ \forall k \in \mathcal{K}, \\
D_s(t) &\le \nu \omega_{s+M}(t) \pi_{s+M} + a_{s+M}^{\max}, && \forall t \ge 0,\ \forall s \in \mathcal{S}.
\end{aligned}
\]
Proof: The proof can be found in [77] and is omitted for the sake of brevity.
3.5 Numerical results
In this section, Monte Carlo simulations are carried out in order to evaluate the system performance of our proposed algorithm. To implement Algorithm 3.1, we use the YALMIP toolbox [88] to model the optimization problem with SDPT3 [89] or MOSEK [90] as the internal solver. For the simulation, we consider the proportional fairness utility function, i.e., $f(r_k) = \log(10^{-4} + r_k)$ [91]. We denote our proposed user association algorithm for the HetNet (resp. homogeneous network) as HetNet-Hybrid (resp. HomNet [41]). Here, HomNet [41] refers to the case in which the MBS serves both MUEs and SUEs without SCs. We compare our proposed algorithm with HomNet [41] and with the previous work [74] (HetNet-Closed Access [74]). The HetNet-Closed Access [74] case considers only a joint in-band scheduling and interference mitigation algorithm with a fixed user association scheme (SCs are configured in a closed subscriber group). The network performance is evaluated under the impact of the number of SCs per km², the number of MBS antennas $N$, and the MBS transmit power levels $P^{(b_0)}$ at low and high frequency bands. We also provide the convergence behaviour of the proposed method and a validation of the approximation method.
3.5.1 Simulation environments
Consider a HetNet scenario, where an MBS is located at the center of a square area and MUEs are randomly deployed within the coverage of the MBS (the minimum MBS-MUE distance is 35 m). The SCs are uniformly distributed and one SUE per SC is considered. The number of antennas at the SCs, $N_s$, is greater than two, while we assume each SC can serve up to $N_s^{\mathrm{au}} = 2$ UEs (including its own SUE). The path loss is modeled as a distance-based path loss with a line-of-sight (LOS) model for urban environments [49, 92, 93]. For the performance evaluation, we first assume that the probability of obtaining LOS is very high, while the effect of other channel models is studied later. The FD interference threshold $\Bbbk_o$ is set to $5 \times 10^{-3}$ and the RZF parameter is $\zeta = 10^{-2}$. The data arrivals follow the Poisson distribution with a mean rate of 1 Gbps, 100 Mbps, and 20 Mbps for 28 GHz, 10 GHz, and 2.4 GHz, respectively. The parameter settings are summarized in Table 1.
Table 1. Parameter settings ([23] ©2017 IEEE)

Parameter                                 Value
Maximum transmit power of MBS, P(b0)      41 dBm
Maximum transmit power of SC              30 dBm
Channel quality, τ                        0.1
RZF parameter, ζ                          10^-2
FD interference threshold, ko             5 × 10^-3
SC antenna gain                           5 dBi
Number of antennas at SC                  Ns + 1
Lyapunov parameter, ν                     2 × 10^6

Path loss model                           Path loss [dB]; bandwidth
LOS @ 28 GHz                              61.4 + 20 log(d); 1 GHz
LOS @ 10 GHz                              55.25 + 18.5 log(d); 100 MHz
LOS @ 2.4 GHz                             17 + 37.6 log(d); 20 MHz
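The LOS path loss entries of Table 1 can be evaluated with a short helper. This is a minimal sketch under two assumptions that the table does not state explicitly: the logarithm is base 10, and the distance d is given in meters. The function and dictionary names are hypothetical.

```python
import math

# (intercept [dB], slope per decade of distance, bandwidth [Hz]) from Table 1
LOS_MODELS = {
    28.0: (61.4, 20.0, 1e9),
    10.0: (55.25, 18.5, 100e6),
    2.4: (17.0, 37.6, 20e6),
}


def los_path_loss_db(freq_ghz: float, distance_m: float) -> float:
    """LOS path loss in dB (base-10 log and meters are assumptions)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    intercept, slope, _bandwidth = LOS_MODELS[freq_ghz]
    return intercept + slope * math.log10(distance_m)


def received_power_dbm(tx_power_dbm: float, freq_ghz: float,
                       distance_m: float, antenna_gain_dbi: float = 0.0) -> float:
    """Link budget: transmit power plus antenna gain minus path loss."""
    return tx_power_dbm + antenna_gain_dbi - los_path_loss_db(freq_ghz, distance_m)


# Example: MBS at 41 dBm, 28 GHz band, receiver 100 m away.
loss_28 = los_path_loss_db(28.0, 100.0)
rx_28 = received_power_dbm(41.0, 28.0, 100.0)
```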
3.5.2 Ultra-dense small cells environment
To show the impact of the network density, the average UE throughput (avgUT) and the cell-edge UE throughput (cell-edge UT) as a function of the number of SCs are shown in Fig. 6 and Fig. 7, respectively. The maximum transmit power of the MBS and SCs is set to 41 dBm and 32 dBm, respectively. In Fig. 6 and Fig. 7, the simulation is carried out in an asymptotic regime where the number of BS antennas and the network size (MUEs and SCs) grow large with a fixed ratio [43]. In particular, the number of SCs and the number of SUEs are both increased from 36 to 1000 per km², while the number of MUEs is scaled up with the number of SCs, such that M = 1.5 × S. Moreover, the number of transmit antennas at the MBS and SCs is set to N = 2 × K and Ns = 6, respectively. Recall that each added SC also brings one SUE, which increases the network load. Here, the total number of users increases while the maximum transmit power is fixed; thus, the per-user transmit power scales as 1/K, which reduces the per-UE throughput. Even though the number of MBS antennas grows with K, the performance of the massive MIMO system reaches its limit as the number of antennas goes to infinity. It can be seen that with an increasing network load, our proposed algorithm HetNet-Hybrid outperforms the baselines (with respect to the avgUT and the cell-edge UT). The performance gap of the cell-edge UT is largest (5.6×) when the number of SCs per km² is 350, and it is small when the number of SCs per km² is too small or too large. The reason is that when the number of SCs per km² is too small, the probability that an MUE finds a nearby open access SC to connect to is low. As the number of SCs per km² increases, MUEs are more likely to connect to nearby open access SCs, which increases the cell-edge UT. However, when the number of SCs per km² is too large, the cell-edge UT performance of HetNet-Hybrid is close to that of HetNet-Closed Access [74] due to the increased FD interference. Moreover, Fig. 6 and Fig. 7 show that the combination of massive MIMO and FD-enabled SCs improves the network performance; for instance, HetNet-Hybrid and HetNet-Closed Access [74] outperform HomNet [41] in terms of both the avgUT and the cell-edge UT. Our results provide insights for network deployment: for a given target UE throughput, what is the optimal number of UEs to schedule and what is the optimal/maximum number of SCs to be deployed?

[Figure 6: achievable avgUT [Gbps] versus number of small cells per km² (36 to 1000) at 28 GHz; curves: HetNet-Hybrid, HetNet-Closed Access, HomNet]
Fig. 6. Achievable avgUT versus number of the small cells per km², S, when scaling according to K = 2.5×S, N = 2×K ([23] ©2017 IEEE).
[Figure 7: achievable cell-edge UT [Gbps] versus number of small cells per km² (36 to 1000) at 28 GHz; curves: HetNet-Hybrid, HetNet-Closed Access, HomNet]
Fig. 7. Achievable cell-edge UT versus number of small cells per km², S, when scaling according to K = 2.5×S, N = 2×K ([23] ©2017 IEEE).
3.5.3 Wireless backhaul impact for different transmit power levels
We also report the avgUT and the total network utility (TNU), along with the average queue length (dashed line), as a function of the MBS maximum transmit power at different frequency bands (28 GHz, 10 GHz, and 2.4 GHz) in Fig. 8 and Fig. 9, respectively. In particular, we consider S = 45 SCs per km², and the number of MUEs M is twice the number of SCs S. The number of MBS antennas is set to N = K, while the number of antennas at SCs, Ns + 1, is set to 5. Since the MBS does not have enough antennas to simultaneously serve all the MUEs and SCs and to alleviate the interference, offloading from the MBS to SCs helps to associate more UEs with the BSs. In this case the TNU is low, since the number of MBS antennas is reduced by half (N = K) compared with the N = 2 × K setting. As the maximum transmit power at the MBS decreases, HetNet-Hybrid outperforms HetNet-Closed Access [74]; there is an inflexion point at which the performance of HetNet-Hybrid is close to that of HetNet-Closed Access [74], namely when the transmit power level is 25 dBm, 31 dBm, and 37 dBm at 28 GHz, 10 GHz, and 2.4 GHz, respectively. It can be observed that at higher frequency bands FD-enabled SCs work better in the open access mode than in the closed access mode under the same transmit power budget. When the maximum MBS transmit power is too small, the performance of HetNet-Hybrid and HetNet-Closed Access [74] is very close to that of HomNet [41].

[Figure 8: achievable avgUT versus P(b0) [dBm] (43 down to 22); panels: 28 GHz [Gbps], 10 GHz [Mbps], 2.4 GHz [Mbps]; curves: HetNet-Hybrid, HetNet-Closed Access, HomNet]
Fig. 8. Achievable avgUT versus P(b0) at 28, 10, and 2.4 GHz, when S = 45 per km², K = 3×S, N = K ([23] ©2017 IEEE).

[Figure 9: total network utility and average queue length [Gb] versus P(b0) [dBm] (43 down to 22); panels: 28 GHz [Gbps], 10 GHz [Mbps], 2.4 GHz [Mbps]; curves: HetNet-Hybrid, HetNet-Closed Access, HomNet]
Fig. 9. The TNU (solid line) and network queue length (dashed line) versus P(b0) at 28, 10, and 2.4 GHz, when S = 45 per km², K = 3×S, N = K ([23] ©2017 IEEE).
[Figure 10: cumulative distribution of the number of iterations (1 to 10) for Algorithm 3.1 with HetNet-Hybrid; vertical axis: cumulative percent (0 to 1)]
Fig. 10. The CDF of the convergence of Algorithm 3.1 ([23] ©2017 IEEE).
3.5.4 Convergence
In Fig. 10 we show the convergence behaviour of our approximated algorithm based on the SCA method when deploying our HetNet-Hybrid algorithm. The convergence analysis is provided in Appendix 1.1. Unlike other works, we plot the cumulative distribution of the number of iterations at which Algorithm 3.1 converges over all t. We observe that the number of iterations is at most 4 with probability 90%, which implies that our proposed algorithm needs only a few iterations to converge.
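A cumulative distribution of this kind can be computed from the iteration counts recorded in each time slot. The sketch below assumes that such counts are available as a plain list; the sample values are hypothetical and are not the data behind Fig. 10.

```python
from collections import Counter


def empirical_cdf(iteration_counts):
    """Return sorted (value, P[X <= value]) pairs for the observed counts."""
    total = len(iteration_counts)
    if total == 0:
        raise ValueError("need at least one sample")
    counts = Counter(iteration_counts)
    cdf, running = [], 0
    for value in sorted(counts):
        running += counts[value]
        cdf.append((value, running / total))
    return cdf


# Hypothetical iteration counts recorded over ten time slots.
samples = [2, 3, 3, 4, 4, 4, 3, 2, 4, 6]
cdf = empirical_cdf(samples)
```

Reading the pair for the value 4 from `cdf` gives the fraction of slots in which the algorithm converged within four iterations.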
We then validate the accuracy of the closed-form expression for the data rate by comparing the Ergodic sum rate $R$, which is obtained by using the SINR from (35) and (36) in simulations of i.i.d. Rayleigh block-fading channels, with the approximated sum rate $\hat{R}$, which is obtained by using the SINR from (41) and (42). The sum rate is defined as the sum of all user data rates. We define the absolute error as $(R - \hat{R})/R$ and plot it versus the number of MBS antennas, while the number of users is fixed to K = 12. As can be seen in Fig. 11, the absolute error decreases as the number of MBS antennas increases. This means that the closed-form expressions are more accurate when the number of MBS antennas is much higher than the number of users, i.e., N ≫ K. The impact of the Lyapunov parameter ν on the achievable average network utility and queue backlog has been shown in our previous work [74]. It has been observed that the gap to the optimal network utility decreases as O(1/ν), while the network backlog increases linearly as O(ν). Hence, the choice of ν results in an [O(1/ν), O(ν)] utility-queue backlog tradeoff, which leads to a utility-latency tradeoff [72].

[Figure 11: absolute error of the sum rate approximation compared to the Ergodic sum rate (0 to 0.015) versus N (15 to 105)]
Fig. 11. Validation of the approximation of the closed-form expression, when K = 12 and K = 3×S ([23] ©2017 IEEE).
3.6 Summary and discussion
This chapter proposed an integrated access-backhaul (IAB) scheme in which a network utility maximization problem was studied to solve the problem of joint load balancing and interference mitigation subject to backhaul dynamics and network stability in the presence of imperfect CSI. By using stochastic optimization, the studied problem was decoupled into dynamic scheduling of MUEs, backhaul provisioning of in-band FD-enabled SCs, and offloading UEs to in-band FD-enabled SCs as a function of interference, number of antennas, and backhaul loads. The numerical results demonstrate that, although at lower frequency bands the performance of open access small cells is close to that of closed access ones at some operating points, the open access full-duplex small cells still yield a higher gain than the closed access ones at higher frequency bands. Moreover, the open access full-duplex small cells achieve a 5.6× gain in terms of cell-edge performance compared to the closed access ones in ultra-dense networks with 350 small cell base stations per km².
This chapter made some ideal assumptions: perfect SIC, perfect channel reciprocity, and static UEs during each coherence time. In addition, when deploying massive antennas, a general analytical mmWave channel model was adopted, so a more specific mmWave channel model should be considered in future work. Moreover, hybrid beamforming should be taken into account to reduce the number of RF chains and the hardware complexity.
Furthermore, this chapter considered a two-hop transmission scheme, and the problem of multi-hop multi-path transmission should be investigated as the network becomes denser and the communication range becomes shorter in mmWave environments. In this regard, the next chapter focuses on path selection and rate allocation over multi-hop mmWave transmissions.
4 Self-backhauled multi-hop architecture
In this chapter, the author extends the scenario studied in Chapter 3 to self-backhauled multi-hop transmissions and answers the research question Q2. In particular, the author proposes a new system design which exploits multi-hop transmission, multiple antenna diversity, mmWave bandwidth, and dynamic PS with traffic splitting techniques to overcome the severe path loss and mitigate the impact of blockages in mmWave networks.
4.1 Main contributions and related work
The main contributions of our work are listed as follows:
– A joint PS and RA optimization framework for multi-hop multi-path scheduling is formulated, whereby self-backhauled FD SCs act as relay nodes to forward data from the macro BS to the intended UEs. A multi-hop transmission technique enables reliable mmWave communications over a long distance. However, there is a probability that the mmWave signal could be blocked by the human body. Hence, we also introduce a multi-path selection scheme in which the transmitter selects a subset of the best paths from the possible paths.
– In the proposed system design, leveraging a massive antenna array, hybrid beamforming is adopted to provide Gbps data rates at mmWave bands. In addition, we impose a probabilistic latency bound to ensure URLLC with a high data rate. For this purpose, the studied problem is cast as a network utility maximization (NUM) problem, subject to a bounded latency constraint and network stability.
– Leveraging a stochastic optimization framework [77], we decouple the studied problem into two sub-problems, namely PS and RA. First, by utilizing historical information, reinforcement learning (RL) is used to build an empirical distribution of the system dynamics to aid in learning the best paths to solve PS [80, 94]. Therein, the concept of regret is employed, defined as the difference between the average utility when choosing the same paths in previous times and the average utility obtained by constantly selecting different paths [80, 94]. The premise is that regret is minimized over time so as to choose the best paths. Second, to solve the non-convex RA sub-problem, we apply the successive convex approximation (SCA) method due to its low complexity and fast convergence [85].
– The proposed approach addresses the following fundamental questions: (i) over which paths should the traffic flow be forwarded? and (ii) what is the data rate per flow/sub-flow?, while ensuring a probabilistic latency constraint and network stability. By using mathematical analysis, the performance of our proposed stochastic optimization framework is comprehensively scrutinized. It is shown that there exists an [O(1/ν), O(ν)] utility-queue backlog trade-off, which leads to utility-latency balancing [77], where ν is a control parameter. In addition, a convergence analysis of the two sub-problems is provided. Finally, the performance of the proposed solution is validated in an extensive set of simulations.
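To make the regret idea in the third contribution concrete, the sketch below implements a generic regret-matching learner. It is not the update rule of [80, 94]: it assumes that, after every slot, the utility each path would have delivered is known (full-information feedback), and all class and variable names are hypothetical.

```python
import random


class RegretMatchingSelector:
    """Generic regret matching over a fixed set of candidate paths."""

    def __init__(self, num_paths: int, seed: int = 0):
        self.num_paths = num_paths
        self.regret = [0.0] * num_paths  # time-averaged regret per path
        self.rounds = 0
        self.rng = random.Random(seed)

    def probabilities(self):
        """Play each path in proportion to its positive average regret."""
        positive = [max(r, 0.0) for r in self.regret]
        total = sum(positive)
        if total == 0.0:
            return [1.0 / self.num_paths] * self.num_paths
        return [p / total for p in positive]

    def select(self) -> int:
        weights = self.probabilities()
        return self.rng.choices(range(self.num_paths), weights=weights)[0]

    def update(self, chosen: int, utilities) -> None:
        """utilities[m]: utility path m would have given in this slot."""
        self.rounds += 1
        for m in range(self.num_paths):
            instant = utilities[m] - utilities[chosen]
            self.regret[m] += (instant - self.regret[m]) / self.rounds


selector = RegretMatchingSelector(num_paths=3, seed=0)
slot_utilities = [0.2, 1.0, 0.5]  # hypothetical and fixed over time
for _ in range(200):
    chosen = selector.select()
    selector.update(chosen, slot_utilities)
probs = selector.probabilities()
```

With these fixed utilities, the path with the highest utility accumulates the largest regret for not having been played, so the selection probability concentrates on it.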
Related work
A tractable rate model was proposed to characterize the rate distribution in self-backhauled mmWave networks [95]. Few efforts have been made to study the mmWave network operation regime, noise-limited or interference-limited, depending on the density of interferers, transmission strategies, or channel propagation models [96, 97, 98]. A large body of research has studied joint RA, congestion control, routing, and scheduling for multi-hop wireless networks, by incorporating the proportional latency based on the sum of queue backlogs [99], applying the concept of the back-pressure algorithm [100, 101], or exploiting the potential of multiple gateways [102].
The authors in [103] considered the problem of joint scheduling and congestion control in a multi-hop mmWave network using a NUM framework, in which the proposed solution is verified under three interference models, namely graph-based actual interference, interference-free (IF), and worst-case interference. The work in [103] also showed that the IF model provides a very tight upper bound for a realistic system evaluation in mmWave cellular networks as long as the optimal throughput can be guaranteed. However, [103] was concerned only with network capacity maximization and single-path streaming; tight latency and reliability constraints should be investigated together with dynamic path diversity.
Moreover, the authors in [104] designed a multi-hop wireless backhaul scheme with a latency guarantee, in which a link activation scheme was proposed to avoid interference and minimize the latency. A rate allocation problem to minimize the application layer video/end-to-end distortion subject to quality of service constraints (latency, backhaul) was considered in [105, 106] for multi-path networks. However, other important aspects in 5G networks such as low latency and high reliability are generally ignored when maximizing the network performance (capacity, energy efficiency, and spectral efficiency) [39, 95, 107].
A recent work in [108] studied the multi-hop relaying transmission challenges for mmWave systems, aiming at maximizing the overall network throughput while taking account of traffic dynamics and link qualities. In our work, we also study the NUM optimization problem, while considering channel variations and network dynamics. Another recent work in [109] addressed the problem of traffic allocation for multi-hop scheduling in mmWave networks to minimize the end-to-end latency, in which the minimum latency is derived based on the channel capacity to determine the portions of traffic over channels such that all traffic fractions arrive simultaneously at the destination.
In addition, the problem of PS and multi-path congestion control for data transfers was studied in [110], in which the aggregate utility increases as more paths are provided. One important suggestion is to re-select randomly from the set of paths and shift between paths with higher payoff. However, splitting data over too many paths leads to increased signaling overhead and causes traffic congestion. While interesting, the preceding works do not address the problem of high data rate, low latency, and reliable communication in multi-path mmWave networks. In this respect, our proposed solution selects the best paths to maximize the network throughput, subject to a latency bound violation constraint with a tolerable probability (reliability). Our previous work [72] studied URLLC-centric mmWave networks for single-hop transmission, and [23] proposed an integrated access and backhaul architecture for two-hop relaying without considering the latency-sensitive constraint. Hence, in this work we extend to the multi-hop wireless backhaul scenario and study a joint PS and RA problem focusing on URLLC. Via mathematical analyses and extensive simulations, we provide insights into the performance of our proposed algorithm and the convergence characteristics of the learning algorithm and the SOCP-based iterative method.
The rest of the chapter is organized as follows. Section 4.2 describes the system model and Section 4.3 provides the problem formulation for a joint PS and RA optimization. Section 4.4 introduces a stochastic optimization framework to decouple our studied problem, whereby two practical solutions are proposed. In Section 4.5, we provide extensive numerical results to compare against other baselines. Conclusions are drawn in Section 4.6.
[Figure 12: network diagram; labels: Macro BS, Self-backhauled SCBS, UE 1, UE 2, UE k, UE K, Routes 1 to 4, Traffic split, Traffic aggregation, Full-duplex communication, One-hop transmission range]
Fig. 12. Illustration of 5G multi-hop self-backhauled mmWave networks ([24] ©2019 IEEE).
4.2 System model
4.2.1 Network model
Let us consider a downlink (DL) transmission of a multi-hop heterogeneous cellular network (HCN) which consists of a macro base station (MBS), a set of B self-backhauled small cell base stations (SCBSs), and a set $\mathcal{K}$ of K user equipments (UEs) as shown in Fig. 12. Let $\mathcal{B} = \{0, 1, \cdots, B\}$ denote the set of all BSs, in which index 0 refers to the MBS. The in-band wireless backhaul is used to provide backhaul among BSs [74]. A full-duplex (FD) transmission protocol is assumed at the SCBSs with perfect self-interference cancellation (SIC) capabilities [111]. Each BS b is equipped with $N_b$ transmitting antennas and $R_b$ radio frequency (RF) chains, such that $1 \le R_b \le N_b$ [112, 113, 114]. Similarly, each UE k is equipped with $N_k$ antennas and $R_k$ RF chains, such that $1 \le R_k \le N_k$, $R_k \le R_b$, and $N_k \ll N_b$. The network topology is modeled as a directed graph $\mathcal{G} = (\mathcal{N}, \mathcal{L})$, where $\mathcal{N} = \mathcal{B} \cup \mathcal{K}$ represents the set of nodes including BSs and UEs, and $\mathcal{L} = \{(i, j) \mid i \in \mathcal{B}, j \in \mathcal{N}\}$ denotes the set of all directional edges (i, j), in which nodes i and j are the transmitter and the receiver, respectively.
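The directed-graph model above maps directly onto an adjacency structure. The following sketch builds the node set and the next-hop sets from a list of edges, enforcing that only BSs transmit. The function name, the use of string identifiers for UEs, and the example topology are illustrative assumptions.

```python
from collections import defaultdict


def build_topology(num_scbs, ue_ids, edges):
    """Return (nodes, next_hops) for BSs {0, ..., num_scbs} and the given UEs."""
    bs_nodes = set(range(num_scbs + 1))  # index 0 is the MBS
    nodes = bs_nodes | set(ue_ids)
    next_hops = defaultdict(set)  # node i -> set of next hops from i
    for tx, rx in edges:
        if tx not in bs_nodes:
            raise ValueError(f"transmitter {tx!r} is not a BS")
        if rx not in nodes or rx == tx:
            raise ValueError(f"invalid receiver {rx!r}")
        next_hops[tx].add(rx)
    return nodes, dict(next_hops)


# Hypothetical example: one MBS (0), two SCBSs (1, 2), and one UE.
nodes, next_hops = build_topology(
    num_scbs=2,
    ue_ids=["ue1"],
    edges=[(0, 1), (0, 2), (1, "ue1"), (2, "ue1")],
)
```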
We consider a queuing network operating in discrete time $t \in \mathbb{Z}^{+}$. There are F independent data flows at the MBS. Each data traffic is destined for only one UE, whereas one UE can receive up to $R_k$ data streams, i.e., F ≥ K. The total number of data streams at the MBS is no greater than the number of RF chains, such that $F \times R_k \le R_b$ [113, 114]. Hereafter, we refer to data traffic as data flow. We use $\mathcal{F}$ to represent the set of F data flows/sub-flows. The MBS can split each flow $f \in \mathcal{F}$ into multiple sub-flows, which are delivered via disjoint paths and aggregated at the UEs [115, 116].

Table 2. Notations for system model ([75, 24] ©2019 IEEE).

Notation                                   Description
$\mathcal{B}, \mathcal{K}$                 Sets of (B + 1) base stations, K user equipments
$\mathcal{N} = \mathcal{B} \cup \mathcal{K}$   Set of nodes including BSs and UEs
$\mathcal{L}$                              Set of all directional edges $\{(i, j) \mid i \in \mathcal{B}, j \in \mathcal{N}\}$
$\mathcal{F}$                              Set of F flows
$\mathcal{Z}_f$                            Set of $Z_f$ disjoint paths observed by flow f
$\mathcal{Z}_f^m$                          Disjoint path state/table m observed by flow f
$\mathcal{N}_i^{(o)}$                      Set of the next hops from node i
$i_f^{(I)}$                                Previous hop of flow f to BS i
$i_f^{(o)}$                                Next hop of flow f from BS i
$p^f_{(i,j)}$                              Transmit power of node i to node j for flow f
$z_f^m = 1$                                Path m is used to send data for flow f
$\pi_f^m$                                  Probability of choosing path m for flow f
We assume that there exist $Z_f$ disjoint paths from the MBS to the UE for flow f. For any disjoint path $m \in \{1, \cdots, Z_f\}$, we denote $\mathcal{Z}_f^m$ as the path state, which contains all path information such as topology and queue states for every hop. Let $\mathcal{Z}_f = \{\mathcal{Z}_f^1, \cdots, \mathcal{Z}_f^m, \cdots, \mathcal{Z}_f^{Z_f}\}$ denote the path states/tables observed by flow f. We use the flow-split indicator vector $\mathbf{z}_f = (z_f^1, \cdots, z_f^{Z_f})$ to denote how the MBS splits flow f, where $z_f^m = 1$ means path m is used to send data for flow f; otherwise, $z_f^m = 0$. Let $\mathcal{N}_i^{(o)}$ denote the set of next hops from node i via a directional edge. We denote the next hop and the previous hop of flow f from and to BS i as $i_f^{(o)}$ and $i_f^{(I)}$, respectively. Table 2 shows the notations used throughout this chapter.
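The path tables and the flow-split indicator can be represented with plain lists. In the sketch below, "disjoint" is interpreted as sharing no relay node between paths; this interpretation, the zero-based path indices, and the example routes are assumptions for illustration.

```python
def are_disjoint(paths) -> bool:
    """True if no relay (intermediate node) is shared between two paths."""
    seen = set()
    for path in paths:
        relays = set(path[1:-1])  # drop the source (MBS) and the destination (UE)
        if relays & seen:
            return False
        seen |= relays
    return True


def flow_split_indicator(num_paths: int, selected) -> list:
    """Indicator vector: entry m is 1 if path m carries a sub-flow."""
    return [1 if m in selected else 0 for m in range(num_paths)]


# Hypothetical candidate paths of one flow from the MBS (node 0) to "ue1".
candidate_paths = [[0, 1, "ue1"], [0, 2, 3, "ue1"], [0, 4, "ue1"]]
z_f = flow_split_indicator(len(candidate_paths), selected={0, 2})
```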
4.2.2 mmWave MIMO channel model
Due to limited spatial scattering in mmWave MIMO propagation [10, 114], we assume that there are $L_{(i,j)}$ clusters between transmitter i and receiver j, such that $L_{(i,j)} \ll \min(N_i, N_j)$. The channel matrix $\mathbf{H}_{(i,j)}$ of link (i, j) can be modelled as [114, 117, 118]

$$\mathbf{H}_{(i,j)} = \sqrt{\frac{N_i \times N_j}{L_{(i,j)}}} \sum_{l=1}^{L_{(i,j)}} h_{(i,j)}^{(l)} \mathbf{A}_j(\alpha_{j,l}) \mathbf{A}_i^{\dagger}(\alpha_{i,l}), \tag{62}$$

where $h_{(i,j)}^{(l)}$ denotes the small-scale fading coefficient of the lth cluster, and $\alpha_{j,l}$ and $\alpha_{i,l}$ denote the azimuth angles of arrival and departure, respectively. Here, $\mathbf{A}_i(\alpha_{i,l})$ and $\mathbf{A}_j(\alpha_{j,l})$ represent the transmitter and receiver response vectors, respectively (please refer to [117, 118] for more details). We denote $\mathbf{H} = \{\mathbf{H}_{(i,j)} \mid (i, j) \in \mathcal{L}\}$ as the network channel matrix.
4.2.3 Transmission rate
We denote $p^f_{(i,j)}$ as the transmit power of node i assigned to node j for flow f, such that $\sum_{f \in \mathcal{F}} \sum_{j \in \mathcal{N}_i^{(o)}} p^f_{(i,j)} \le P_i^{\max}$, where $P_i^{\max}$ is the maximum transmit power of node i. We have the following power constraint

$$\mathcal{P} = \Big\{ p^f_{(i,j)} \ge 0,\ i, j \in \mathcal{N} \ \Big|\ \sum_{f \in \mathcal{F}} \sum_{j \in \mathcal{N}_i^{(o)}} p^f_{(i,j)} \le P_i^{\max} \Big\}. \tag{63}$$

Vector $\mathbf{p} = (p^f_{(i,j)} \mid \forall i, j \in \mathcal{N}, \forall f \in \mathcal{F})$ denotes the transmit power over all flows.
Based on the hybrid beamforming and combining model [113, 114], with $\mathbf{c}_{(i,j)} \in \mathbb{C}^{N_j \times 1}$ as the RF combining and baseband equalizer and $\mathbf{v}_{(i,j)} \in \mathbb{C}^{N_i \times 1}$ as the hybrid analog/digital precoding, the Ergodic achievable rate$^5$ $r_{(i,j)}$ at receiver j from transmitter i can be calculated as

$$r_{(i,j)} = \mathbb{E}_{\mathbf{H},\mathbf{p}} \left[ w \log \left( 1 + \frac{p_{(i,j)} \big| \mathbf{c}_{(i,j)}^{\dagger} \mathbf{H}_{(i,j)}^{T} \mathbf{v}_{(i,j)} \big|^2}{\sum_{i' \ne i} \sum_{j' \in \mathcal{N}_{i'}^{(o)}} p_{(i',j')} \big| \mathbf{c}_{(i,j)}^{\dagger} \mathbf{H}_{(i',j)}^{T} \mathbf{v}_{(i',j)} \big|^2 + \sigma_j^2 \| \mathbf{c}_{(i,j)} \|^2} \right) \right]. \tag{64}$$

Here $p_{(i,j)}$ is the transmit power from transmitter i assigned to receiver j, and the thermal noise of receiver j is $\eta_j \sim \mathcal{CN}(0, \sigma_j^2)$ with a variance of $\sigma_j^2$. In addition, w denotes the system bandwidth of the mmWave frequency band.
5 Note that we omit the beam search/tracking time, since it can be done fast and is negligible compared to the transmission time [119]. Due to the disjoint path assumption and directional beamforming, the interference associated with transmissions from transmitter i to other receivers j′, received at j, is assumed to be negligible or can be mitigated by designing the two-layer precoder at transmitter i [23, 25]. For the sake of simplification, the impact of this interference is left for future work.
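The clustered channel in (62) can be sketched numerically as follows. Several modelling choices here are assumptions and are not specified in the text: the response vectors are taken as unit-norm uniform linear array (ULA) vectors with half-wavelength spacing, the cluster gains are drawn as CN(0, 1), and the angles are uniform on [0, 2π). All names are hypothetical.

```python
import cmath
import math
import random


def ula_response(num_antennas: int, angle_rad: float):
    """Unit-norm ULA response vector with half-wavelength spacing (assumed)."""
    scale = 1.0 / math.sqrt(num_antennas)
    return [scale * cmath.exp(1j * math.pi * n * math.sin(angle_rad))
            for n in range(num_antennas)]


def clustered_channel(n_tx: int, n_rx: int, num_clusters: int, rng: random.Random):
    """n_rx x n_tx channel matrix with the structure of (62)."""
    scale = math.sqrt(n_tx * n_rx / num_clusters)
    sigma = math.sqrt(0.5)  # per-component std for CN(0, 1) gains (assumed)
    h = [[0j] * n_tx for _ in range(n_rx)]
    for _ in range(num_clusters):
        gain = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        a_rx = ula_response(n_rx, rng.uniform(0.0, 2.0 * math.pi))  # arrival
        a_tx = ula_response(n_tx, rng.uniform(0.0, 2.0 * math.pi))  # departure
        for r in range(n_rx):
            for c in range(n_tx):
                # outer product A_j(alpha_j) A_i^dagger(alpha_i)
                h[r][c] += scale * gain * a_rx[r] * a_tx[c].conjugate()
    return h


# Example: 8 transmit antennas, 2 receive antennas, 3 clusters.
H = clustered_channel(n_tx=8, n_rx=2, num_clusters=3, rng=random.Random(1))
```

Because the number of clusters is small compared with the antenna counts, the resulting matrix has low rank, which is the sparsity property that hybrid beamforming exploits.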
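$$r_{(i,j)} = \mathbb{E}_{\mathbf{H},\mathbf{p}} \left[ w \log \left( 1 + \frac{p_{(i,j)}\, g_{(i,j)}^{(t)}\, g_{(i,j)}^{(s)}\, g_{(i,j)}^{(r)}}{\sum_{i' \ne i} \sum_{j' \in \mathcal{N}_{i'}^{(o)}} p_{(i',j')}\, g_{(i',j)}^{(t)}\, g_{(i',j)}^{(s)}\, g_{(i',j)}^{(r)} + \sigma_j^2} \right) \right]. \tag{65}$$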
As studied in [120], previous works on mmWave hybrid beamforming mainly focused on the physical layer or signal processing aspects [112, 113, 114, 121]. The authors in [120] developed an accurate analytical model that captures the essence of mmWave hybrid beamforming while remaining tractable enough to analyze the throughput-delay performance. In our work, we adopt the model in [120] to formulate the network utility maximization subject to congestion control and network stability. In particular, let $g_{(i,j)}^{(t)}$ and $g_{(i,j)}^{(r)}$ denote the transmitter and receiver analog beamforming gains at transmitter i and receiver j, respectively. In addition, we use $\omega_{(i,j)}^{(t)}$ and $\omega_{(i,j)}^{(r)}$ to represent the angles deviating from the strongest path between transmitter i and receiver j. Also, let $\theta_{(i,j)}^{(t)}$ and $\theta_{(i,j)}^{(r)}$ denote the beamwidths at transmitter i and receiver j, respectively. We adopt the widely used antenna radiation pattern [117, 120, 122] to determine the beamforming gain as

$$g_{(i,j)}\big(\omega_{(i,j)}, \theta_{(i,j)}\big) = \begin{cases} \dfrac{2\pi - (2\pi - \theta_{(i,j)})\Gamma}{\theta_{(i,j)}}, & \text{if } |\omega_{(i,j)}| \le \dfrac{\theta_{(i,j)}}{2}, \\[2mm] \Gamma, & \text{otherwise}, \end{cases}$$

where 0 < Γ ≪ 1 is the side lobe gain. After the beam alignment is done, the receiver sends pilot sequences to the transmitter. The transmitter estimates the channel and precodes the signals. Throughout this work, the effective data rate of link (i, j), $r_{(i,j)}$, is calculated as (65), in which $g_{(i,j)}^{(s)}$ denotes the spatial channel gain of link (i, j) [117, 118, 122].
For a given channel state and transmit power, the data rate on edge (i, j) for flow f can be posed as a function of the channel state and transmit power, i.e., $r^f_{(i,j)}(\mathbf{H}, \mathbf{p})$, such that $\sum_{f \in \mathcal{F}} r^f_{(i,j)} = r_{(i,j)}$. We denote $\mathbf{r} = (r^f_{(i,j)} \mid \forall i, j \in \mathcal{N}, \forall f \in \mathcal{F})$ as the vector of data rates over all flows.
Note that after beam searching and alignment are done [117, 122, 123, 124], the receiver broadcasts pilot sequences to the transmitters; each transmitter estimates the channel to the corresponding receiver and precodes the transmit signal in the DL. With $N_j$ antennas and $R_j$ RF chains, each receiver is capable of receiving multiple data streams from different transmitters using either the main beam or the side lobe beam. We assume that traffic splitting and aggregation are done ideally, so that multiple data streams can be transmitted via different paths.
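The sectored radiation pattern and the rate expression (65) can be evaluated for a single channel realization as sketched below. The sketch assumes a base-2 logarithm and drops the expectation over channel and power, so it returns an instantaneous rate rather than the Ergodic rate; the function names are hypothetical.

```python
import math


def beam_gain(offset_rad: float, beamwidth_rad: float, side_lobe_gain: float) -> float:
    """Main-lobe gain inside the beamwidth, side-lobe gain outside."""
    if not 0.0 < beamwidth_rad <= 2.0 * math.pi:
        raise ValueError("beamwidth must lie in (0, 2*pi]")
    if abs(offset_rad) <= beamwidth_rad / 2.0:
        return (2.0 * math.pi
                - (2.0 * math.pi - beamwidth_rad) * side_lobe_gain) / beamwidth_rad
    return side_lobe_gain


def link_rate(bandwidth_hz, tx_power, g_tx, g_ch, g_rx, interference, noise_power):
    """Instantaneous version of (65) for one realization (log base 2 assumed)."""
    sinr = tx_power * g_tx * g_ch * g_rx / (interference + noise_power)
    return bandwidth_hz * math.log2(1.0 + sinr)


# Example: 0.5 rad beamwidth, side-lobe gain 0.1, perfectly aligned beam.
theta, gamma = 0.5, 0.1
main = beam_gain(0.0, theta, gamma)
side = beam_gain(1.0, theta, gamma)
```

The main-lobe expression follows from conserving the total radiated power: the main-lobe gain over θ plus the side-lobe gain over the remaining 2π − θ sums to 2π.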
4.2.4 Network queues
Let $Q_f^i(t)$ denote the queue length at BS i at time slot t for flow f. The queue length evolution at the MBS i = 0 is

$$Q_f^i(t+1) = \Big[ Q_f^i(t) - \sum_{m=1,\ i_f^{(o)} \in \mathcal{Z}_f^m}^{Z_f} r^f_{(i,\, i_f^{(o)})}(t),\ 0 \Big]^{+} + a_f(t), \tag{66}$$

where $a_f(t)$ is the data arrival at the MBS during slot t, which is i.i.d. over time with a mean value $\bar{a}_f$ and is bounded by $a_f(t) \le a_f^{\max} < \infty$. Due to the disjoint paths, for each flow f the incoming rate from the previous hop $i_f^{(I)}$ at SCBS i comes either from another SCBS or from the MBS, and thus the queue evolution at SCBS $i = 1, \cdots, B$ is given by

$$Q_f^i(t+1) = \Big[ Q_f^i(t) - r^f_{(i,\, i_f^{(o)})}(t),\ 0 \Big]^{+} + r^f_{(i_f^{(I)},\, i)}(t). \tag{67}$$
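The two queue recursions translate into one-line updates. The sketch below reads $[\cdot, 0]^{+}$ as the maximum of the argument and zero; the function names and example values are hypothetical.

```python
def mbs_queue_update(queue: float, served_rates, arrival: float) -> float:
    """(66): serve the first hops of all active paths, then add new arrivals."""
    return max(queue - sum(served_rates), 0.0) + arrival


def scbs_queue_update(queue: float, outgoing_rate: float, incoming_rate: float) -> float:
    """(67): forward to the next hop, then add traffic from the previous hop."""
    return max(queue - outgoing_rate, 0.0) + incoming_rate


# Example: an MBS queue of 5 units served over two paths, with 4 new units.
q_next = mbs_queue_update(5.0, [2.0, 1.0], 4.0)
```

The maximum with zero ensures that a node cannot forward more data than it holds, so unused service in a slot is lost rather than carried over.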
4.3 Problem formulation
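Assume that the MBS determines the paths over which to split data flow f with a given probability distribution, i.e., $\boldsymbol{\pi}_f = (\pi_f^1, \cdots, \pi_f^{Z_f})$, where for each $m \in \mathcal{Z}_f$ we have $\pi_f^m = \Pr(\mathbf{z}_f = z_f^m)$. Here, $\boldsymbol{\pi}_f$ is the probability mass function (PMF) of the flow-split vector, i.e., $\sum_{m=1}^{Z_f} \Pr(z_f^m) = 1$. We denote $\boldsymbol{\pi} = \{\boldsymbol{\pi}_1, \cdots, \boldsymbol{\pi}_f, \cdots, \boldsymbol{\pi}_F\} \in \Pi$ as the global probability distribution of all flow-split vectors, in which Π is the set of all possible global PMFs. Let $\bar{x}_f$ denote the achievable average rate of flow f such that

$$\bar{x}_f \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} x_f(\tau), \quad \text{and} \quad x_f(\tau) = \sum_{m=1,\ i_f^{(o)} \in \mathcal{Z}_f^m}^{Z_f} \mathbb{E}_{\mathbf{H},\mathbf{p}} \Big[ \pi_f^m\, r^f_{(i,\, i_f^{(o)})}(\tau) \Big] \Big|_{i=0}.$$

We assume that the achievable rate is bounded, i.e.,

$$0 \le x_f(t) \le a_f^{\max}, \tag{68}$$

where $a_f^{\max}$ is the maximum achievable rate of flow f at every time t. Vector $\bar{\mathbf{x}} = (\bar{x}_1, \cdots, \bar{x}_F)$ denotes the time average of rates over all flows. Let $\mathcal{R}$ denote the rate region, which is defined as the convex hull of the average rates, i.e., $\bar{\mathbf{x}} \in \mathcal{R}$.
We define $U_0$ as the network utility function, i.e., $U_0(\bar{\mathbf{x}}) = \sum_{f \in \mathcal{F}} U(\bar{x}_f)$ [110, 23]. Here, U(·) is assumed to be a twice differentiable, concave, and increasing L-Lipschitz function for all $\bar{x} \ge 0$. According to Little's law [125], the average queuing latency is defined as the ratio of the queue length to the average arrival rate. By taking account of the probabilistic latency constraints for each flow/sub-flow, the network utility maximization (NUM) is formulated as follows

$$\begin{aligned}
\textbf{OP2:} \quad \max_{\boldsymbol{\pi},\, \mathbf{x},\, \mathbf{p}} \quad & U_0(\bar{\mathbf{x}}) && \text{(69a)} \\
\text{subject to} \quad & \Pr\!\left( \frac{Q_f^i(t)}{\bar{a}_f} \ge d^{\mathrm{th}} \right) \le \epsilon, \ \forall t,\ f \in \mathcal{F},\ i \in \mathcal{B}, && \text{(69b)} \\
& \lim_{t \to \infty} \frac{\mathbb{E}\big[ |Q_f^i| \big]}{t} = 0, \ \forall f \in \mathcal{F},\ \forall i \in \mathcal{B}, && \text{(69c)} \\
& \mathbf{x}(t) \in \mathcal{R}, && \text{(69d)} \\
& \boldsymbol{\pi} \in \Pi, && \text{(69e)} \\
& \text{and (63), (68),}
\end{aligned}$$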
where $d^{\mathrm{th}}$ reflects the latency threshold required for UEs, and ε ≪ 1 is the target probability for reliable communication$^6$. The probabilistic latency constraint (69b) implies that the probability that the latency of each flow at node i exceeds $d^{\mathrm{th}}$ is very small, which captures the constraints of ultra-low latency and reliable communication [72, 126]. It is also used to avoid congestion for each flow f at any point (BS) in the network, since the queue length is kept below $d^{\mathrm{th}} \bar{a}_f$ with probability 1 − ε. Hence, (69b) forces all BSs to transmit without building large queues, and (69c) maintains network stability.
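The above problem has a non-linear probabilistic constraint (69b), which cannot be solved directly. Hence, we replace the non-linear constraint (69b) with a linear deterministic equivalent by applying Markov's inequality [127, 72], i.e., $\Pr(X \ge x) \le \mathbb{E}[X]/x$ for a non-negative random variable X and x > 0. Applied to (69b), this gives $\Pr(Q_f^i(t) \ge \bar{a}_f d^{\mathrm{th}}) \le \mathbb{E}[Q_f^i(t)]/(\bar{a}_f d^{\mathrm{th}})$, so it suffices to keep the right-hand side below ε. Thus, we relax (69b) as

$$\mathbb{E}\big[ Q_f^i(t) \big] \le \bar{a}_f\, \epsilon\, d^{\mathrm{th}}. \tag{70}$$

Assuming that $a_f(t)$ follows a Poisson arrival process [127], we derive the expected queue length in (66) for i = 0 as

$$\mathbb{E}\big[ Q_f^i(t) \big] = t\, \bar{a}_f - \sum_{\tau=1}^{t} \sum_{m=1,\ i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m\, r^f_{(i,\, i_f^{(o)})}(\tau), \tag{71}$$

and the expected queue length in (67), for each SCBS, i.e.,

$$\mathbb{E}\big[ Q_f^i(t) \big] = \sum_{\tau=1}^{t} \sum_{m} \pi_f^m \Big( r^f_{(i_f^{(I)},\, i)}(\tau) - r^f_{(i,\, i_f^{(o)})}(\tau) \Big). \tag{72}$$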
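The step from the probabilistic constraint (69b) to the mean-queue bound (70) can be checked with a few numbers. The sketch below computes the mean-queue budget on the right-hand side of (70) and the Markov upper bound on the violation probability; the function names and the numerical values are illustrative assumptions.

```python
def mean_queue_budget(mean_arrival: float, epsilon: float, d_th: float) -> float:
    """Right-hand side of (70): largest mean queue length that is allowed."""
    return mean_arrival * epsilon * d_th


def markov_violation_bound(expected_queue: float, mean_arrival: float, d_th: float) -> float:
    """Markov upper bound on Pr(Q / mean_arrival >= d_th), capped at 1."""
    return min(expected_queue / (mean_arrival * d_th), 1.0)


# Hypothetical numbers: 2 Gbps mean arrival, 10 ms latency threshold, 5 % target.
mean_arrival = 2e9  # bits per second
d_th = 10e-3        # seconds
epsilon = 0.05
budget = mean_queue_budget(mean_arrival, epsilon, d_th)  # bits
```

Keeping the mean queue at or below this budget guarantees that the violation probability does not exceed ε, although Markov's inequality is generally loose, so the actual violation probability can be much smaller.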
6 For the sake of simplicity, we assume that all UEs have the same latency and reliability requirements.
Subsequently, combining the constraints (70) and (71), we obtain the following linear constraint (73) on the instantaneous rate requirements, which helps to analyse and optimize the URLLC problem [72, 126], for the MBS i = 0,

$$\bar{a}_f \big( t - \epsilon d^{\mathrm{th}} \big) - \sum_{\tau=1}^{t-1} \sum_{m=1,\ i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m\, r^f_{(i,\, i_f^{(o)})}(\tau) \le \sum_{m=1,\ i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m\, r^f_{(i,\, i_f^{(o)})}(t). \tag{73}$$

Similarly, for each SCBS $i = 1, \cdots, B$, by combining (70) and (72), we have

$$-\bar{a}_f\, \epsilon\, d^{\mathrm{th}} + \sum_{\tau=1}^{t-1} \sum_{m} \pi_f^m \Big( r^f_{(i_f^{(I)},\, i)}(\tau) - r^f_{(i,\, i_f^{(o)})}(\tau) \Big) \le \sum_{m} \pi_f^m \Big( r^f_{(i,\, i_f^{(o)})}(t) - r^f_{(i_f^{(I)},\, i)}(t) \Big). \tag{74}$$

With the aid of the above derivations, we consider (73) and (74) instead of (69b) in the original problem (69). In practice, the statistical information of all candidate paths needed to decide $\boldsymbol{\pi}_f, \forall f \in \mathcal{F}$, is not available beforehand, and thus solving (69) is challenging. One solution is to assign paths randomly to each flow, which does not guarantee optimality, whereas applying an exhaustive search is not practical. Therefore, in this work, we resort to Lyapunov stochastic optimization, which handles the queuing network and characterizes the queuing latency in the presence of randomness (mmWave wireless channels and arbitrary arrivals). As a result, (69) is decoupled into sub-problems, which can be solved by low-complexity and efficient methods. In particular, RL is leveraged to find the best paths without requiring the statistical information, and the SCA method obtains a locally efficient solution for flow rate allocation.
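Constraint (73) can be read as a lower bound on the aggregate first-hop rate that the MBS must provide in slot t. The sketch below evaluates that bound from the history of past weighted rates. It assumes that $d^{\mathrm{th}}$ is expressed in time slots, so that it can be combined with t, and it clamps the result at zero because a rate cannot be negative; both points, as well as the names, are assumptions.

```python
def mbs_required_rate(t: int, mean_arrival: float, epsilon: float,
                      d_th: float, past_weighted_rates) -> float:
    """Left-hand side of (73): minimum aggregate rate needed in slot t.

    past_weighted_rates[k] holds sum_m pi_f^m * r_f(tau) for tau = k + 1,
    i.e. the slots 1, ..., t - 1.
    """
    if len(past_weighted_rates) != t - 1:
        raise ValueError("need exactly t - 1 past rate samples")
    required = mean_arrival * (t - epsilon * d_th) - sum(past_weighted_rates)
    return max(required, 0.0)  # clamp: a rate requirement cannot be negative


# Hypothetical example: slot 3, mean arrival 2 units per slot, 10 % target,
# latency threshold of 5 slots, and the rates delivered in slots 1 and 2.
needed = mbs_required_rate(3, 2.0, 0.1, 5.0, [1.5, 2.0])
```

A larger ε or $d^{\mathrm{th}}$ loosens the requirement, whereas under-serving in earlier slots raises the rate that is needed now.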
4.4 Proposed path selection and rate allocation algorithm
In this section, we propose a Lyapunov optimization based framework to solve our predefined problem (69) with relaxed latency constraints. To do that, we first introduce a set of auxiliary variables to refine the original problem (69). Next, we convert the constraints into virtual queues and derive the conditional Lyapunov drift function. Finally, the solution of the equivalent problem is obtained by minimizing the Lyapunov drift and a penalty from the objective function. Let us start by rewriting (69) equivalently as follows:
\begin{align}
\text{RP2:} \quad \max_{\boldsymbol{\varphi},\, \boldsymbol{\pi},\, \mathbf{p}} \quad & U_0(\bar{\boldsymbol{\varphi}}) \tag{75a} \\
\text{subject to} \quad & \bar{\varphi}_f - \bar{x}_f \le 0, \quad \forall f \in \mathcal{F}, \tag{75b} \\
& \text{(63), (68), (69c), (69e), (73), (74),} \notag
\end{align}
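where the new constraint (75b) is introduced to replace the rate constraint (69d) with new auxiliary variables $\boldsymbol{\varphi} = (\varphi_1, \cdots, \varphi_F)$. In (75b), $\bar{\boldsymbol{\varphi}} \triangleq \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}\left[|\boldsymbol{\varphi}(\tau)|\right]$. In order to ensure the inequality constraint (75b), we introduce a virtual queue vector $Y_f(t)$, which is given by
\[
Y_f(t+1) = \big[ Y_f(t) + \varphi_f(t) - x_f(t) \big]^+, \quad \forall f \in \mathcal{F}. \tag{76}
\]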
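Let $\boldsymbol{\Xi}(t) = (\mathbf{Q}(t), \mathbf{Y}(t))$ denote the queue backlogs. We first write the conditional Lyapunov drift for slot $t$ as
\[
\Delta(\boldsymbol{\Xi}(t)) = \mathbb{E}\big[ L(\boldsymbol{\Xi}(t+1)) - L(\boldsymbol{\Xi}(t)) \,\big|\, \boldsymbol{\Xi}(t) \big], \tag{77}
\]
where $L(\boldsymbol{\Xi}(t)) \triangleq \frac{1}{2} \Big[ \sum_{f=1}^{F} \sum_{i=0}^{B} Q_f^i(t)^2 + \sum_{f=1}^{F} Y_f(t)^2 \Big]$ is the quadratic Lyapunov function of $\boldsymbol{\Xi}(t)$ [77]. We apply the Lyapunov drift-plus-penalty technique [23, 77]: at each time slot $t$, the solution of (75) is obtained by minimizing the Lyapunov drift and a penalty from the objective function, i.e.,
\[
\min \; \Delta(\boldsymbol{\Xi}(t)) - \nu\, \mathbb{E}\big[ U_0(\boldsymbol{\varphi}) \,\big|\, \boldsymbol{\Xi}(t) \big]. \tag{78}
\]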
Here, $\nu$ is a control parameter that trades off utility optimality against queue length [23, 77]. Moreover, the stability of $\boldsymbol{\Xi}(t)$ ensures that the constraints (69c) and (75b) hold. Noting that $\max[a, 0]^2 \le a^2$ and $(a \pm b)^2 \le a^2 \pm 2ab + b^2$ for any positive real numbers $a, b$, and dropping the indices $t, f, \ldots$ for brevity, we have:
\begin{align*}
\big(\max[Q - R^{(o)}, 0] + R^{(I)}\big)^2 - Q^2 &\le 2Q\big(R^{(I)} - R^{(o)}\big) + \big(R^{(I)} - R^{(o)}\big)^2, \\
\big(\max[Q - R^{(o)}, 0] + a\big)^2 - Q^2 &\le 2Q\big(a - R^{(o)}\big) + \big(a - R^{(o)}\big)^2, \\
\max[Y + \varphi - x, 0]^2 - Y^2 &\le 2Y(\varphi - x) + (\varphi - x)^2.
\end{align*}
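Subsequently, following the calculations of the Lyapunov optimization [77], for any $\boldsymbol{\varphi} \in \mathcal{R}$, any feasible $\boldsymbol{\pi}$, and all possible $\boldsymbol{\Xi}(t)$ for all $t$, we obtain
\begin{align}
(78) \le{} & \sum_{f=1}^{F} \sum_{i=1}^{B} Q_f^i \, \mathbb{E}\Big[ \sum_{m} \pi_f^m \big( r^f_{(i_f^{(I)},\, i)} - r^f_{(i,\, i_f^{(o)})} \big) \,\Big|\, \boldsymbol{\Xi}(t) \Big] \notag \\
& - \sum_{f=1}^{F} Q_f^{i|i=0} \, \mathbb{E}\Big[ \sum_{m=1,\, i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m \, r^f_{(i,\, i_f^{(o)})} \,\Big|\, \boldsymbol{\Xi}(t) \Big] \tag{79} \\
& + \sum_{f=1}^{F} \mathbb{E}\big[ Y_f \varphi_f - \nu U(\varphi_f) - Y_f x_f \,\big|\, \boldsymbol{\Xi}(t) \big] + \Psi. \notag
\end{align}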
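Here, $\Psi$ is a finite constant that satisfies
\[
\Psi \ge \frac{1}{2} \sum_{f=1}^{F} \sum_{i=1}^{B} \mathbb{E}\Big[ \sum_{m} \pi_f^m \big( r^f_{(i_f^{(I)},\, i)} - r^f_{(i,\, i_f^{(o)})} \big)^2 \,\Big|\, \boldsymbol{\Xi}(t) \Big] + \frac{1}{2} \sum_{f=1}^{F} \mathbb{E}\Big[ \sum_{m=1,\, i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m \big( a_f - r^f_{(i,\, i_f^{(o)})} \big)^2 \,\Big|\, \boldsymbol{\Xi}(t) \Big] + \frac{1}{2} \sum_{f=1}^{F} \mathbb{E}\big[ (\varphi_f - x_f)^2 \,\big|\, \boldsymbol{\Xi}(t) \big]
\]
[77, 23].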
The solution to (75) can be obtained by minimizing the upper bound in (79) without the finite constant $\Psi$. For every slot $t$, observing $\boldsymbol{\Xi}(t)$, we have three decoupled subproblems and provide the solutions for each subproblem as follows. The flow-split vector and the probability distribution are determined by
\[
\text{SP1:} \quad \min_{\boldsymbol{\pi}} \; \sum_{f=1}^{F} \aleph_f \quad \text{subject to (69e),}
\]
where
\[
\aleph_f = \sum_{i=1}^{B} Q_f^i \sum_{m} \pi_f^m \Big( r^f_{(i_f^{(I)},\, i)} - r^f_{(i,\, i_f^{(o)})} \Big) - Q_f^{i|i=0} \sum_{m=1,\, i_f^{(o)} \in \mathcal{Z}_f^m} \pi_f^m \, r^f_{(i,\, i_f^{(o)})}.
\]
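Then, we select the optimal auxiliary variables by solving
\[
\text{SP2:} \quad \min_{\boldsymbol{\varphi} | \boldsymbol{\pi}} \; \sum_{f=1}^{F} \big[ Y_f \varphi_f - \nu U(\varphi_f) \big] \quad \text{subject to } \varphi_f(t) \ge 0, \; \forall f \in \mathcal{F}.
\]
Let $\varphi_f^*$ be the optimal solution obtained from the first-order derivative of the objective function of SP2. Assuming a logarithmic utility function, we have $\varphi_f^*(t) = \max\left\{ \frac{\nu}{Y_f}, 0 \right\}$. Finally, the RA is done by assigning transmit power, which is obtained by
\[
\text{SP3:} \quad \min_{\mathbf{x},\, \mathbf{p} | \boldsymbol{\pi}} \; \sum_{f=1}^{F} -Y_f x_f \quad \text{subject to (63), (68), (73), (74).}
\]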
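As a hedged illustration of SP2 and the virtual queue recursion (76), the following Python sketch computes the auxiliary variable for a single flow under the logarithmic utility assumed above. The function names and the cap `phi_max` are hypothetical: the cap is an assumption added here so that the result stays finite when the virtual queue is empty, and it is not part of the formulation in the text.

```python
def auxiliary_variable(y, nu, phi_max):
    """Minimise y*phi - nu*log(phi) over 0 < phi <= phi_max (SP2, one flow)."""
    if y <= 0.0:
        # With an empty virtual queue the objective decreases in phi,
        # so the (assumed) cap phi_max is returned.
        return phi_max
    # Stationary point of y*phi - nu*log(phi): y - nu/phi = 0.
    return min(nu / y, phi_max)


def update_virtual_queue(y, phi, x):
    """Virtual queue recursion (76): Y(t+1) = [Y(t) + phi(t) - x(t)]^+."""
    return max(y + phi - x, 0.0)
```

A larger virtual queue backlog lowers the selected auxiliary variable, which is how the drift term pushes the time-averaged auxiliary rate below the served rate.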
4.4.1 Path selection
Recall that $\mathbf{z}_f$ represents the flow-split vector given to flow $f$, and $z_f^m = 1$ when path $m$ is used to send data for flow $f$. The MBS selects paths for each flow with a given probability (mixed strategy) [73, 80]. We denote $u_f^m = u_f\big(z_f^m, \mathbf{z}_f^{-m}\big)$ as the utility function of flow $f$ when using path $m$, where the vector $\mathbf{z}_f^{-m}$ denotes the flow-split vector excluding path $m$. Since the MBS can choose more than one path to deliver data, it follows from SP1 that the utility gain of flow $f$ is
\[
u_f = \sum_{m} u_f^m = -\aleph_f.
\]
To exploit the historical information, the MBS determines a flow-split vector for each flow $f$ from $\mathcal{Z}_f$ based on the PMF from the previous stage $t-1$, i.e.,
\[
\boldsymbol{\pi}_f(t-1) = \big( \pi_f^1(t-1), \cdots, \pi_f^{Z_f}(t-1) \big). \tag{80}
\]
Here, we define $\boldsymbol{\Phi}_f(t) = \big(\Phi_f^1(t), \cdots, \Phi_f^m(t), \cdots, \Phi_f^{Z_f}(t)\big)$ as the regret vector of determining the flow-split vector for flow $f$. The MBS selects the flow-split vector with the highest regret, for which the mixed-strategy probability is given as
\[
\pi_f^m(t) = \frac{\big[\Phi_f^m(t)\big]^+}{\sum_{m' \in \mathcal{Z}_f} \big[\Phi_f^{m'}(t)\big]^+}. \tag{81}
\]
Let $\hat{\boldsymbol{\Phi}}_f(t) = \big(\hat{\Phi}_f^1(t), \cdots, \hat{\Phi}_f^m(t), \cdots, \hat{\Phi}_f^{Z_f}(t)\big)$ be the estimated regret vector of flow $f$. Basically, with the goal of maximizing the cumulative reward in SP1, the MBS (agent) has to discover the possible paths (action set) in order to find the best paths (distribution of actions with higher pay-off) in the long run [80]. If the MBS spends much time on discovering paths (called exploration), the convergence time becomes longer. If the MBS only exploits the action that gave the highest pay-off at the beginning (called exploitation), it may lose the chance to obtain a higher reward later. Hence, balancing the trade-off between exploration and exploitation is fundamental for efficient learning. For this purpose, we adopt the logit of the Boltzmann-Gibbs (BG) kernel to efficiently learn the best paths [80, 94], $\boldsymbol{\beta}_f\big(\hat{\boldsymbol{\Phi}}_f(t)\big)$, given by
\[
\boldsymbol{\beta}_f\big(\hat{\boldsymbol{\Phi}}_f(t)\big) = \operatorname*{argmax}_{\boldsymbol{\pi}_f \in \Pi} \sum_{m \in \mathcal{Z}_f} \Big[ \pi_f^m(t)\, \hat{\Phi}_f^m(t) - \kappa_f\, \pi_f^m(t) \ln\big(\pi_f^m(t)\big) \Big], \tag{82}
\]
where the trade-off factor $\kappa_f$ is used to balance between exploration and exploitation [128, 94, 129]. If $\kappa_f$ is small, the MBS selects the $\mathbf{z}_f$ with the highest payoff. For $\kappa_f \to \infty$, all decisions have equal probability.
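For a given set of $\hat{\boldsymbol{\Phi}}_f(t)$ and $\kappa_f$, we solve (82) to find the probability distribution, in which the solution determining the disjoint paths for each flow $f$ is given as
\[
\beta_f^m\big(\hat{\boldsymbol{\Phi}}_f(t)\big) = \frac{\exp\Big( \frac{1}{\kappa_f} \big[\hat{\Phi}_f^m(t)\big]^+ \Big)}{\sum_{m' \in \mathcal{Z}_f} \exp\Big( \frac{1}{\kappa_f} \big[\hat{\Phi}_f^{m'}(t)\big]^+ \Big)}. \tag{83}
\]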
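The closed form (83) is a softmax over the positive parts of the estimated regrets. The sketch below is one possible way to evaluate it in Python; the function name `boltzmann_gibbs` and the list-based interface are assumptions made for illustration, and subtracting the largest exponent is a standard numerical safeguard that does not change the result.

```python
import math


def boltzmann_gibbs(regrets, kappa):
    """Path probabilities per (83) from the estimated regrets of one flow."""
    positive = [max(r, 0.0) for r in regrets]   # [.]^+ operator
    peak = max(positive)                        # shift for numerical stability
    weights = [math.exp((r - peak) / kappa) for r in positive]
    total = sum(weights)
    return [w / total for w in weights]
```

With a small `kappa` the distribution concentrates on the path with the largest regret, while a very large `kappa` yields a nearly uniform distribution, matching the two limiting cases stated for the trade-off factor.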
We denote $\hat{\mathbf{u}}_f(t)$ as the estimated utility of flow $f$ at time instant $t$ with action $\mathbf{z}_f$, i.e., $\hat{\mathbf{u}}_f(t) = \big(\hat{u}_f^1(t), \cdots, \hat{u}_f^m(t), \cdots, \hat{u}_f^{Z_f}(t)\big)$. Upon receiving the feedback, $\tilde{u}_f(t)$ denotes the utility observed by flow $f$, i.e., $\tilde{u}_f(t) = u_f(t-1)$. We propose the learning mechanism at each time instant $t$ as follows.
Learning procedure: The estimates of the utility, regret, and probability distribution functions are updated for all actions per path $m$ as follows [73, 80]:
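\begin{align}
\hat{u}_f^m(t) &= \hat{u}_f^m(t-1) + \iota_f^{(1)}(t)\, \mathbb{1}_{\{\mathbf{z}_f = z_f^m\}} \big( \tilde{u}_f(t) - \hat{u}_f^m(t-1) \big), \notag \\
\hat{\Phi}_f^m(t) &= \hat{\Phi}_f^m(t-1) + \iota_f^{(2)}(t) \big( \hat{u}_f^m(t) - \tilde{u}_f(t) - \hat{\Phi}_f^m(t-1) \big), \tag{84} \\
\pi_f^m(t) &= \pi_f^m(t-1) + \iota_f^{(3)}(t) \big( \beta_f^m\big(\hat{\boldsymbol{\Phi}}_f(t)\big) - \pi_f^m(t-1) \big). \notag
\end{align}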
Here, $\iota_f^{(1)}(t)$, $\iota_f^{(2)}(t)$, and $\iota_f^{(3)}(t)$ are the learning rates [73, 80, 129]. Based on the probability distribution as per (84), the MBS determines the flow-split vector for each flow $f$. Note that the learning-aided PS is performed over a long-term period to ensure that the paths do not change suddenly, so that the SCBSs have sufficient time to deliver traffic from the queues. For instance, at the beginning of each large time scale, the best paths are selected, and they are used for the rest of that large time scale, as shown in Fig. 13.
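To make the three coupled updates in (84) concrete, the sketch below performs one learning step for a single flow. It is an illustration under stated assumptions, not the author's implementation: exactly one path index `chosen` is treated as the played action (the indicator in (84)), the learning-rate exponents 0.51, 0.55, and 0.6 are the values reported later in Section 4.5, and all function and argument names are hypothetical.

```python
import math


def bg_distribution(regrets, kappa):
    """Boltzmann-Gibbs distribution (83) over the positive parts of regrets."""
    positive = [max(r, 0.0) for r in regrets]
    peak = max(positive)
    weights = [math.exp((r - peak) / kappa) for r in positive]
    total = sum(weights)
    return [w / total for w in weights]


def learning_step(u_hat, regret_hat, pi, chosen, observed, t, kappa):
    """One update of (84); lists are indexed by path m, t is the time index."""
    iota1 = 1.0 / (t + 1) ** 0.51   # utility learning rate
    iota2 = 1.0 / (t + 1) ** 0.55   # regret learning rate
    iota3 = 1.0 / (t + 1) ** 0.60   # probability learning rate
    # Utility estimate: only the played path moves towards the observation.
    new_u = [u + iota1 * (observed - u) if m == chosen else u
             for m, u in enumerate(u_hat)]
    # Regret estimate: gap between each path's estimate and the observation.
    new_regret = [r + iota2 * (u - observed - r)
                  for r, u in zip(regret_hat, new_u)]
    # Probabilities: move towards the Boltzmann-Gibbs target distribution.
    target = bg_distribution(new_regret, kappa)
    new_pi = [p + iota3 * (b - p) for p, b in zip(pi, target)]
    return new_u, new_regret, new_pi
```

Because the probability update is a convex combination of two distributions whenever the third learning rate lies in [0, 1], the updated vector remains a valid probability distribution.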
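Here, we briefly state the conditions for convergence to the $\epsilon$-coarse correlated equilibrium of the reinforcement learning based algorithm, where $\epsilon$ is a very small positive value [130]. The complete proof is given in [94, 129]. The learning rates $\iota_f^{(1)}(t)$, $\iota_f^{(2)}(t)$, and $\iota_f^{(3)}(t)$ are chosen to satisfy the following convergence conditions:
\begin{align*}
& \lim_{t \to \infty} \sum_{\tau=0}^{t} \iota_f^{(1)}(\tau) = +\infty, \quad \lim_{t \to \infty} \sum_{\tau=0}^{t} \iota_f^{(2)}(\tau) = +\infty, \quad \lim_{t \to \infty} \sum_{\tau=0}^{t} \iota_f^{(3)}(\tau) = +\infty, \\
& \lim_{t \to \infty} \sum_{\tau=0}^{t} \big(\iota_f^{(1)}(\tau)\big)^2 < +\infty, \quad \lim_{t \to \infty} \sum_{\tau=0}^{t} \big(\iota_f^{(2)}(\tau)\big)^2 < +\infty, \quad \lim_{t \to \infty} \sum_{\tau=0}^{t} \big(\iota_f^{(3)}(\tau)\big)^2 < +\infty, \\
& \lim_{t \to \infty} \frac{\iota_f^{(3)}(t)}{\iota_f^{(2)}(t)} = 0, \quad \lim_{t \to \infty} \frac{\iota_f^{(2)}(t)}{\iota_f^{(1)}(t)} = 0.
\end{align*}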
4.4.2 Rate allocation
Consider $r^f_{(i,j)} = \log\big(1 + p^f_{(i,j)} |g_{(i,j)}(h)|^2\big)$ as the transmission rate, where the effective channel gain$^{7}$ for mmWave channels can be modeled as $|g_{(i,j)}(h)|^2 = \frac{|\tilde{g}_{(i,j)}(h)|^2}{1 + I^{\max}}$ [23, 69]. Here, $\tilde{g}_{(i,j)}(h)$ and $I^{\max}$ denote the normalized channel gain and the maximum interference, respectively. Denoting the left-hand side (LHS) of (73) and (74) as $D_f^i$ for simplicity, the optimal values of flow control $\mathbf{x}$ and transmit power $\mathbf{p}$ in sub-problem 3 (SP3) are found by solving
7The effective channel gain captures the path loss, channel variations, and interference penalty. Here, the impact of interference is considered small due to highly directional beamforming and the high path loss of interfering signals in the mmWave frequency band, and thus multi-hop directional transmission can be operated in dense mmWave networks [95, 96, 97, 98, 103].
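\begin{align}
\min_{\mathbf{x},\, \mathbf{p} | \boldsymbol{\pi}} \quad & \sum_{f=1}^{F} -Y_f x_f \tag{85a} \\
\text{subject to} \quad & 1 + p^f_{(i,\, i_f^{(o)})} \big|g_{(i,\, i_f^{(o)})}\big|^2 \ge e^{x_f}, \quad \forall f \in \mathcal{F},\; i = 0, \tag{85b} \\
& \frac{1 + p^f_{(i,\, i_f^{(o)})} \big|g_{(i,\, i_f^{(o)})}\big|^2}{1 + p^f_{(i_f^{(I)},\, i)} \big|g_{(i_f^{(I)},\, i)}\big|^2} \ge e^{D_f^i}, \quad \forall f \in \mathcal{F},\; \forall i = 1, \ldots, B, \tag{85c} \\
& \sum_{f \in \mathcal{F}} p^f_{(i,\, i_f^{(o)})} \le P_i^{\max}, \quad \forall i \in \mathcal{B},\; \forall f \in \mathcal{F}. \tag{85d}
\end{align}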
The constraint (85c) is non-convex. Motivated by the low complexity of the SCA method, we solve (85) by replacing (85c) with a proper convex approximation; however, such an approximation of (85c) is very hard to find directly [85, 131]. In this regard, we introduce the slack variable $y$ to transform (85c) into equivalent constraints that admit a proper bound satisfying the conditions in [85, Property A]:
\[
\frac{2 + p^f_{(i,\, i_f^{(o)})} \big|g_{(i,\, i_f^{(o)})}\big|^2}{2} \ge \sqrt{ y^2 + \left( \frac{p^f_{(i,\, i_f^{(o)})} \big|g_{(i,\, i_f^{(o)})}\big|^2}{2} \right)^2 }, \tag{86}
\]
\[
\frac{y^2}{1 + p^f_{(i_f^{(I)},\, i)} \big|g_{(i_f^{(I)},\, i)}\big|^2} \ge e^{D_f^i}. \tag{87}
\]
Here, the constraint (86) takes the form of a second-order cone inequality [131, 85, 132], while the LHS of constraint (87) is a quadratic-over-affine function, which is iteratively replaced by its first-order approximation to obtain a convex approximation as follows [86, 133]:
\[
\frac{2 y\, y^{(l)}}{1 + p^{f(l)}_{(i_f^{(I)},\, i)} \big|g_{(i_f^{(I)},\, i)}\big|^2} - \frac{\big(y^{(l)}\big)^2 \Big( 1 + p^f_{(i_f^{(I)},\, i)} \big|g_{(i_f^{(I)},\, i)}\big|^2 \Big)}{\Big( 1 + p^{f(l)}_{(i_f^{(I)},\, i)} \big|g_{(i_f^{(I)},\, i)}\big|^2 \Big)^2} \ge e^{D_f^i}. \tag{88}
\]
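The step from (87) to (88) relies on the fact that a quadratic-over-affine function is bounded from below by its first-order expansion, so any point satisfying (88) also satisfies (87). The short Python check below illustrates this numerically; writing `z` for the affine denominator $1 + p\,|g|^2$ and the function names are notational assumptions introduced only for this sketch.

```python
def quad_over_affine(y, z):
    """Left-hand side of (87) with z standing for 1 + p*|g|^2 (z > 0)."""
    return y * y / z


def first_order_bound(y, z, y_l, z_l):
    """Left-hand side of (88): linearisation of y^2/z around (y_l, z_l)."""
    return 2.0 * y * y_l / z_l - (y_l ** 2) * z / (z_l ** 2)
```

The gap between the two functions equals $z\,(y/z - y^{(l)}/z^{(l)})^2$, which is non-negative for $z > 0$ and vanishes at the expansion point, so the approximation is tight at the current iterate.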
Here, the superscript $l$ denotes the $l$th iteration. Hence, we iteratively solve the approximated convex problem of (85) as in Algorithm 4.1, in which the approximated problem is given as
\[
\min_{\mathbf{x},\, \mathbf{p} | \boldsymbol{\pi}} \; \sum_{f=1}^{F} -Y_f x_f \tag{89}
\]
subject to (85d), (68), (85b), (86), (88).
[Figure: flow diagram with two blocks. Learning in the long-term period: regret learning based path selection (SP1) and path distribution estimation. Rate allocation in the short-term period: auxiliary variable selection (SP2), iterative rate allocation (SP3), DL transmission, and queue update.]
Fig. 13. Information flow diagram of the learning-aided PS and RA approach ([75, 24] ©2019 IEEE).
Algorithm 4.1 Iterative RA
1: Initialization: set $l = 0$ and generate an initial point $y^{(l)}$.
2: repeat
3: Solve (89) with $y^{(l)}$ to get the optimal value $y^{(l)\star}$.
4: Update $y^{(l+1)} := y^{(l)\star}$; $l := l + 1$.
5: until convergence
Finally, the information flow diagram of the learning-aided PS and RA approach is shown in Fig. 13, where the RA is executed in a short-term period. Note that the PS and RA are both done at the MBS; in this work, we assume that the information is shared among the base stations over the X2 interface. As opposed to a brute-force approach yielding the globally optimal solution, the proposed iterative solution, which uses time scale separation, considerably reduces the search time and computational complexity while obtaining an efficient suboptimal solution$^{8}$.
8Note that finding the global optimum is outside the scope of our study. The effectiveness of the SOCP method was verified in the literature and shown to be robust in practical scenarios [131].
4.5 Numerical results
In this section, Monte Carlo simulations are carried out to evaluate the system performance of our proposed algorithm. To run Algorithm 4.1, we use the YALMIP toolbox to model the optimization problem with MOSEK as the internal solver [88]. For the simulations, we assume that there are two flows from the MBS to two UEs, while the number of available paths for each flow is four [110]. The MBS selects two paths from the four most popular paths$^{9}$. Each path contains two relays, the total number of SCBSs is 8, and the one-hop distance varies from 50 to 100 meters. The maximum transmit powers of the MBS and each SC are 43 dBm and 30 dBm, respectively, and the SC antenna gain is 5 dBi. The number of antennas $N_b$ at each BS is set to 8 and 64 for small and large antenna arrays, respectively. The number of antennas $N_k$ at each UE is set to 2 and 16 for small and large antenna arrays, respectively. The numbers of RF chains at the BS, $R_b$, and at the UE, $R_k$, are set to 8 and 2, respectively.
For simulation purposes, the general channel model for arbitrary antenna arrays is used. In particular, the estimate $\hat{\mathbf{H}}_{(i,j)} \in \mathbb{C}^{N_i \times N_j}$ of the channel matrix $\mathbf{H}_{(i,j)} \in \mathbb{C}^{N_i \times N_j}$ between the transmitter $i$ and the receiver $j$ can be modeled as [25, 134]
\[
\hat{\mathbf{H}}_{(i,j)} = \sqrt{N_i \times N_j}\; \boldsymbol{\Theta}_{(i,j)}^{1/2} \Big( \sqrt{1 - \tau_j^2}\, \mathbf{W}_{(i,j)} + \tau_j \hat{\mathbf{W}}_{(i,j)} \Big),
\]
where $\mathbf{W}_{(i,j)} = \big[\mathbf{w}_{(i,j)}^{1}, \cdots, \mathbf{w}_{(i,j)}^{n_j}, \cdots, \mathbf{w}_{(i,j)}^{N_j}\big] \in \mathbb{C}^{N_i \times N_j}$ is the small-scale fading channel matrix, whose entries are independent and identically distributed (i.i.d.) with zero mean and variance $\frac{1}{N_i \times N_j}$, and $\mathbf{w}_{(i,j)}^{n_j} \in \mathbb{C}^{N_i \times 1}$ is the small-scale fading channel vector between the transmitter antenna array and the $n_j$th antenna of receiver $j$. Here, $\tau_j \in [0, 1]$ reflects the estimation accuracy for receiver $j$; if $\tau_j = 0$, then $\hat{\mathbf{H}}_{(i,j)} = \mathbf{H}_{(i,j)}$, i.e., perfect channel state information is assumed at the transmitters [135]. $\hat{\mathbf{W}}_{(i,j)} \in \mathbb{C}^{N_i \times N_j}$ is the estimation noise, also modeled as a realization of a circularly symmetric complex Gaussian matrix with zero mean and variance $\frac{1}{N_i \times N_j}$ [23, 25]. Moreover, $\boldsymbol{\Theta}_{(i,j)} \in \mathbb{C}^{N_i \times N_i}$ denotes the antenna spatial correlation matrix that accounts for the path loss and shadow fading, such that $\operatorname{Rank}(\boldsymbol{\Theta}_{(i,j)}) \ll N_i$.
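We generate the spatial correlation matrix as $\boldsymbol{\Theta}_{(i,j)} = PL_{(i,j)} \tilde{\boldsymbol{\Theta}}_{(i,j)}$ with $\operatorname{Rank}(\tilde{\boldsymbol{\Theta}}_{(i,j)}) = R_i$, where $\tilde{\boldsymbol{\Theta}}_{(i,j)}$ is the normalized spatial correlation matrix with $\operatorname{Tr}(\tilde{\boldsymbol{\Theta}}_{(i,j)}) = N_i$ [134]. The mmWave path loss $PL_{(i,j)}$ is modeled as a distance-based path loss for urban environments at 28 GHz with a 1 GHz system bandwidth [136, 49], which may exist as a line-of-sight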
9As studied in [110], it suffices for a flow to maintain at least two paths, provided that it repeatedly selects new paths at random and replaces an existing path if the new one provides higher throughput.
(LOS), non-LOS (NLOS), or blockage state. We adopt the mmWave channel model used in the system-level simulation in [136], given by
\[
PL(d) = \Pr(d)\, PL_{\mathrm{LOS}}(d) + \big(1 - \Pr(d)\big)\, PL_{\mathrm{NLOS}}(d),
\]
where $PL_{\mathrm{LOS}}(d)$ and $PL_{\mathrm{NLOS}}(d)$ are the distance-based path losses for the LOS and NLOS states at distance $d$, respectively [136]. Here, $\Pr(d)$ denotes a Boolean random variable that is 1 with some probability. For the general blockage channel model, the LOS probability is defined as $\exp(-0.006d)$, so the NLOS probability is $1 - \exp(-0.006d)$ [136, 49]. For the analog beamforming, the side lobe gain Γ is set to 14, and the beamwidths at the transmitter and receiver are set to $\frac{\pi}{4}$ and $\frac{\pi}{3}$ radians, respectively.
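The path loss model above can be sketched in a few lines of Python. The LOS and NLOS expressions are those listed in Table 3 for 28 GHz; treating `log` as the base-10 logarithm and the distance `d` as being in meters are assumptions of this sketch, as are the function names.

```python
import math
import random


def los_probability(d):
    """LOS probability exp(-0.006 d) of the blockage channel model."""
    return math.exp(-0.006 * d)


def path_loss_db(d, is_los):
    """Distance-based path loss in dB at 28 GHz (Table 3), d assumed in meters."""
    if is_los:
        return 61.4 + 20.0 * math.log10(d)
    return 72.0 + 29.2 * math.log10(d)


def sample_path_loss_db(d, rng):
    """Draw the Boolean LOS indicator Pr(d) and return the matching path loss."""
    is_los = rng.random() < los_probability(d)
    return path_loss_db(d, is_los)
```

For example, `sample_path_loss_db(100.0, random.Random(0))` returns either the LOS or the NLOS value at 100 m, depending on the random draw; at that distance the LOS probability is about 0.55.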
We assume that the traffic flow is divided equally into two sub-flows, and the arrival rate of each sub-flow varies from 2 to 5 Gbps for the small antenna array case. The maximum delay requirement β and the target reliability probability ε are set to 10 ms and 5%, respectively [72]. For the learning algorithm, the Boltzmann temperature (trade-off factor) $\kappa_f$ is set to 5, while the learning rates $\iota^{(1)}(t)$, $\iota^{(2)}(t)$, and $\iota^{(3)}(t)$ are set to $\frac{1}{(t+1)^{0.51}}$, $\frac{1}{(t+1)^{0.55}}$, and $\frac{1}{(t+1)^{0.6}}$, respectively [129, 73]. The parameter settings are summarized in Table 3.
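These learning rates can be checked against the convergence conditions of Section 4.4.1. Each rate has the form $(t+1)^{-c}$ with $0.5 < c \le 1$, so the sum of the rates diverges while the sum of their squares converges, and the ratios of consecutive rates decay as $(t+1)^{-0.05}$ and $(t+1)^{-0.04}$. The Python sketch below, with hypothetical function names, evaluates the two ratios to illustrate how slowly they approach zero.

```python
def learning_rates(t):
    """Learning rates of Section 4.5 at time index t >= 0."""
    return (t + 1) ** -0.51, (t + 1) ** -0.55, (t + 1) ** -0.60


def rate_ratios(t):
    """Ratios iota3/iota2 and iota2/iota1, which must vanish as t grows."""
    iota1, iota2, iota3 = learning_rates(t)
    return iota3 / iota2, iota2 / iota1
```

Since the exponents are close to each other, the ratios decrease only polynomially with a very small exponent, which separates the three learning time scales gradually rather than sharply.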
Our work combines three main features: (i) NUM [77, 101], (ii) dynamic path selection learning [94], and (iii) URLLC-aware rate allocation [72]. We consider the following baselines: Baseline 1 employs features (i) and (ii), Baseline 2 applies features (i) and (iii), and Baseline 3 considers only feature (i). We benchmark our work against these baselines to assess the impact of the dynamic path selection and of the URLLC-constrained rate allocation, which has not been addressed in the literature in the context of mmWave communications. In addition, the Single hop scheme lets the MBS deliver data to the UEs over a single long-distance hop, for which the probability of LOS communication is low, so that blockage needs to be taken into account [136].
4.5.1 Small antenna array system
We first evaluate the network performance under the small antenna array setting, i.e., $N_i = 8$, $N_j = 2$. In Fig. 14, we report the average one-hop latency$^{10}$ versus the mean arrival rate µ. As we increase µ, baselines 3, 2, and 1 violate the latency constraint at µ = 3.5, 4.5, and 5 Gbps, respectively. In contrast, the average latency of our proposed
10The average end-to-end latency is defined as the sum of the average one-hop latency of all hops.
Table 3. Parameter settings ([24] ©2019 IEEE)
Parameter | Value
B, K, F | 8, 2, 4
Number of BS antennas $N_b$ | 8, 64
Number of UE antennas $N_k$ | 2, 16
Maximum latency β | 10 ms
Target reliability ε | 0.05, 0.1, 0.15
Boltzmann temperature | 2, 5, 10, 20, 50
Path loss model, LOS @ 28 GHz | 61.4 + 20 log(d) dB
Path loss model, NLOS @ 28 GHz | 72 + 29.2 log(d) dB
System bandwidth | 1 GHz
algorithm increases gradually with µ but stays below the latency bound, β = 10 ms. The reason is that our proposed algorithm satisfies the latency requirement via the equivalent instantaneous rate as per (73) and (74), while baselines 1 and 3 use the traditional utility-latency trade-off approach without considering the latency constraint, and baseline 2 considers the random PS mechanism only. The benefit of the learning-based path selection is that choosing paths with high payoff and less congestion results in low latency. At µ = 4.5 Gbps, the average one-hop latency of baseline 1 with learning is lower than that of baselines 2 and 3, whereas our proposed scheme reduces latency by 50.64%, 81.32%, and 92.9% compared to baselines 1, 2, and 3, respectively. When µ = 5 Gbps, the average latency of all baselines increases dramatically, violating the latency requirement of 10 ms, while our proposed scheme still meets the latency requirement.
In Fig. 15, we report the tail distribution (complementary cumulative distribution function (CCDF)) of latency to showcase how often the system experiences a latency greater than the target latency level [137] for µ = 4.5 Gbps, ε = 5%, β = 10 ms. In contrast to the average latency, the tail distribution is an important metric to reflect the URLLC characteristic. For instance, at µ = 4.5 Gbps, by imposing the probabilistic latency constraint, our proposed approach ensures reliable communication with a better guaranteed probability, i.e., Pr(latency > 10 ms) < $10^{-6}$. In contrast, baseline 1 with learning violates the latency constraint with high probability, where Pr(latency >
[Figure: average one-hop latency [ms] versus mean arrival rate [Gbps] for the proposed algorithm and baselines 1 to 3; β = 10 ms, ε = 0.05.]
Fig. 14. Average one-hop latency versus mean arrival rates ([75, 24] ©2019 IEEE).
10 ms) = 0.08 and Pr(latency > 25 ms) < $10^{-6}$, while the performance of baselines 2 and 3 is worse. For instance, as shown in Fig. 15, baselines 2 and 3 obtain Pr(latency > 10 ms) > 0.12 and Pr(latency > 10 ms) > 0.24, respectively. For throughput comparison, we observe that for µ = 4.5 Gbps, our proposed algorithm delivers 4.4874 Gbps of average network throughput per sub-flow, while baselines 1, 2, and 3 deliver 4.4759, 4.4682, and 4.3866 Gbps, respectively. Here, the Single hop scheme only delivers 3.55 Gbps due to the high path loss, which causes large latency.
Note that in this work we mainly focus on the low latency scale, i.e., 1-10 ms, where the achievable rates of all schemes are very high and close to each other. Hence, we report the average MBS queue length instead of the average achievable rate. Generally speaking, as per (66), the average achievable rate can be extracted from the average MBS queue length and the mean arrival rate, i.e., $\bar{x}_f = \mu_f - \bar{Q}_f$. In Fig. 16, we plot the average queue length of the MBS as a function of the mean arrival rate. As we increase the mean arrival rate from 2 to 5 Gbps, the average MBS queue length of our proposed algorithm increases from 0.01 Gb to 0.04 Gb, which means that the average latency at the MBS increases from 5 ms to 8 ms and thus meets the latency constraint (69b). In contrast, the average queue length of the baselines grows to a level corresponding to an average latency of up to 16 ms, which violates the latency constraint (69b).
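The conversion between queue length and latency used above is the ratio of the average backlog to the mean arrival rate. As a hedged worked example in Python (the function name is hypothetical), a backlog of 0.01 Gb at 2 Gbps gives 5 ms and 0.04 Gb at 5 Gbps gives 8 ms, matching the values quoted for the proposed algorithm.

```python
def latency_ms(queue_gb, arrival_gbps):
    """Average latency in ms as average queue length over mean arrival rate."""
    return 1000.0 * queue_gb / arrival_gbps


proposed_low = latency_ms(0.01, 2.0)    # about 5 ms
proposed_high = latency_ms(0.04, 5.0)   # about 8 ms
# Assumption: a backlog of 0.08 Gb at 5 Gbps is inferred from the 16 ms
# quoted for the baselines; the text does not state this backlog explicitly.
baseline_high = latency_ms(0.08, 5.0)   # about 16 ms
```

Only the first two backlog values are stated in the text; the third is back-computed from the reported latency and is included purely for illustration.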
[Figure: CCDF versus one-hop latency [ms] for the proposed algorithm and baselines 1 to 3. Annotations: Proposed: Pr(delay > 10) < $10^{-6}$; BL1: Pr(delay > 10) > 0.08; BL2: Pr(delay > 10) > 0.12; BL3: Pr(delay > 10) > 0.24.]
Fig. 15. CCDF of one-hop latency, small antenna array ([75, 24] ©2019 IEEE).
[Figure: average MBS queue length [Gb] versus mean arrival rate [Gbps] for the proposed algorithm and baselines 1 to 3.]
Fig. 16. Average MBS queue length versus mean arrival rate ([24] ©2019 IEEE).
4.5.2 Large antenna array system
In order to achieve higher beamforming gain, large antenna arrays are employed at
both transmitter and receiver, i.e., Ni = 64, N j = 16. In this setting, the maximum
transmit power at the MBS is adjusted to 41 dBm only and the transmitter beamwidth
[Figure: CCDF (logarithmic scale) versus latency [ms] for the proposed algorithm (LOS and blockage) and baselines 1 to 3 (LOS). Annotation: all schemes satisfy the latency constraint due to the higher antenna gain when the mean arrival rate is small.]
Fig. 17. CCDF of one-hop latency, large antenna array, µ = 4.5 Gbps ([24] ©2019 IEEE).
is reduced to 0.5 radian. Our proposed algorithm is evaluated under both LOS and blockage channel states, whereas all baselines use the LOS communication model [136, 49, 138, 139]. First, in Fig. 17 we plot the CCDF of one-hop latency (in logarithmic scale) of all schemes when the mean arrival rate is 4.5 Gbps, which is the same mean arrival rate as used in Fig. 15. Interestingly, due to the higher antenna gains, none of the schemes violates the latency constraint with an upper bound of 10 ms and a target probability of 5%, as illustrated in Fig. 17. However, baseline 3 does not employ the two important features, (ii) dynamic path selection learning and (iii) URLLC-aware rate allocation, and thus has a longer tail of the latency distribution.
Next, we increase the mean arrival rate to showcase the trade-off between latency and network arrival rate. Fig. 18 reports the CCDF of one-hop latency of all schemes at a higher mean arrival rate, i.e., µ = 9.5 Gbps. It can be observed that the performance of our proposed algorithm degrades under blockage channels, where the latency distribution has a longer tail than that of baseline 1. As the mean arrival rate increases, baselines 2 and 3 violate the latency constraint with high probabilities, such that Pr(latency > 10 ms) > 10% for baseline 2 and Pr(latency > 10 ms) > 20% for baseline 3. The latency of all schemes increases with the network arrival rate, which showcases the trade-off between the latency and the network arrival rate.
[Figure: CCDF (logarithmic scale) versus latency [ms] for the proposed algorithm (LOS and blockage) and baselines 1 to 3 (LOS). Annotation: not all schemes meet the latency constraint when the mean arrival rate is higher.]
Fig. 18. CCDF of one-hop latency, large antenna array, µ = 9.5 Gbps ([24] ©2019 IEEE).
4.5.3 Convergence characteristics
We plot the convergence of the iterative algorithm as a function of the number of hops in Fig. 19. Here, we provide the distribution of the number of iterations of the SOCP-based algorithm, where the algorithm stops once the convergence criterion is met with an accuracy of $10^{-2}$. As the number of hops increases, the numbers of constraints and variables increase, and thus the algorithm requires more iterations to converge. Overall, our proposed algorithm needs only a few iterations to converge at each time slot $t$, as shown in Fig. 19. For example, for three-hop transmission, the probability that the number of iterations is less than or equal to 7 is 90%.
[Figure: CDF of the number of iterations for 3 hops (9 BSs) and 5 hops (21 BSs).]
Fig. 19. The iterative algorithm convergence ([24] ©2019 IEEE).
4.6 Summary and discussion
In this chapter, the author proposed a multi-hop multi-path scheduling scheme to support reliable communication, incorporating the probabilistic latency constraint and traffic splitting techniques in 5G mmWave networks. In particular, the problem was modeled as a network utility maximization subject to bounded latency with a guaranteed reliability probability, and network stability. Massive MIMO and mmWave communication techniques were employed to further improve the DL transmission of multi-hop self-backhauled small cells. By leveraging stochastic optimization, the problem was decoupled into PS and RA, which were solved by applying the reinforcement learning and successive convex approximation methods, respectively. A comprehensive performance analysis of our proposed algorithm was mathematically provided. Numerical results show that our proposed framework significantly reduces the latency compared to the baselines with and without learning.
This chapter addressed the problem of selecting the best paths from many possible paths in multi-hop multi-path mmWave networks. The traffic aggregation was assumed to be done perfectly, which is not practical, and thus a possible research direction is to investigate the impact of imperfect traffic aggregation at the UEs. For simulation purposes, a simple assumption was made that the traffic is split equally among the flows. In practice, the weight of each traffic flow should be proportional to the route load or other design metrics.
Moreover, the state-action space is much larger in ultra-dense networks, hence the proposed reinforcement learning solution would be limited by the relatively slow convergence speed of reinforcement learning. Hence, in order to obtain a faster solution, the concept of deep reinforcement learning should be leveraged.
In the next two chapters, 5 and 6, the author takes a closer look at the access links and studies simplified scenarios. Specifically, Chapter 5 considers a single cell massive MIMO system, in which a macro BS equipped with a large antenna array serves multiple outdoor users and the SCs are replaced by normal UEs. By doing so, the author focuses on a simplified scenario to find a solution providing low latency communication with eMBB services. Moreover, Chapter 6 focuses on ultra-dense SC networks, in which the macro cell layer is removed from the scenario. Chapter 6 studies the problem of providing reliable communication, which aims to achieve high mean rates with a small variance.
5 Low-latency communication in massive MIMO wireless networks
This chapter examines a simplified scenario focusing on a single cell massive MIMO wireless network in which wireless backhaul SCs are not considered; nevertheless, the studied problem can be straightforwardly extended to the multi-cell scenario where the SCs act as normal UEs. To that end, this chapter answers the third question, Q3, of how to provide low-latency communication for eMBB services, which is an essential issue in 5G wireless networks.
5.1 Main contributions and related work
Most of the existing works on mmWave-enabled massive MIMO systems focus mainly
on providing capacity improvements, while latency and reliability are not addressed.
Although latency and reliability are applicable to many scenarios (e.g. mission-critical
applications), this chapter is concerned with addressing the fundamental question in
mmWave-enabled massive MIMO systems of how to simultaneously provide order of
magnitude capacity improvements and latency reduction. To this end, the Lyapunov
framework is extended to incorporate probabilistic latency constraints, taking into ac-
count the queue state, arrival rate, and channel dynamics with a guaranteed probability.
5.2 System model
Consider the downlink (DL) transmission of a single cell massive MIMO system$^{11}$ consisting of one macro base station (MBS) equipped with $N$ antennas, and a set $\mathcal{M} = \{1, \ldots, M\}$ of single-antenna user equipments (UEs). We assume that $N \ge M$ and $N \gg 1$. Further, co-channel time-division duplexing (TDD) is considered, in which the MBS estimates channels via the uplink phase. We denote the propagation channel between the MBS and the $m$th UE as $\mathbf{H}_m = \sqrt{N}\, \boldsymbol{\Theta}_m^{1/2} \tilde{\mathbf{H}}_m$, where $\boldsymbol{\Theta}_m \in \mathbb{C}^{N \times N}$ depicts the antenna spatial correlation, and the rank of the spatial correlation matrix $\boldsymbol{\Theta}_m$ is much smaller than
11The studied model can be extended to multi-cell massive MIMO systems in which the problem of inter-cell
interference can be addressed by designing a hierarchical precoder at the MBS to mitigate both intra-cell and
inter-cell interference, or by applying an interference coordination approach [25].
the number of antennas due to the limited spatial scattering of the MIMO environment. Moreover, the spatial channel model is clustered, i.e., it belongs to a finite set with a finite size [25]. The elements of $\tilde{\mathbf{H}}_m \in \mathbb{C}^{N \times 1}$ are independent and identically distributed (i.i.d.) with zero mean and variance $1/N$. In addition, the channels experience flat and block fading, and imperfect channel state information (CSI) is assumed. As per [41], the estimated channel can be modelled as
\[
\hat{\mathbf{H}}_m = \sqrt{1 - \tau_m^2}\, \mathbf{H}_m + \tau_m \sqrt{N}\, \boldsymbol{\Theta}_m^{1/2} \mathbf{z}_m, \quad \forall m \in \mathcal{M}.
\]
Here, $\mathbf{z}_m \in \mathbb{C}^{N \times 1}$ denotes the estimation noise vector, which has i.i.d. elements with zero mean and variance $1/N$, and $\tau_m \in [0, 1]$ reflects the estimation error; in case of perfect CSI, $\tau_m = 0$.
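Given the estimated channel matrix $\hat{\mathbf{H}} = [\hat{\mathbf{H}}_1, \cdots, \hat{\mathbf{H}}_M] \in \mathbb{C}^{N \times M}$, the MBS employs beamforming techniques to exploit the spatial multiplexing gains of a massive MIMO system [25, 140]. Within the scope of this chapter, we consider a digital beamforming scheme for a single cell massive MIMO system, whereas a hybrid beamforming design can be applied for more complex systems, which is left for future work [113, 114]. In particular, the MBS utilizes the regularized zero-forcing (RZF) precoder with a precoding matrix $\mathbf{V} = [\mathbf{V}_1, \cdots, \mathbf{V}_M] \in \mathbb{C}^{N \times M}$, which is given by $\mathbf{V} = \big(\hat{\mathbf{H}}^{\dagger} \hat{\mathbf{H}} + N\zeta \mathbf{I}_N\big)^{-1} \hat{\mathbf{H}}^{\dagger}$ [25, 41]. Note that the regularization parameter $\zeta > 0$ is scaled by $N$ to ensure that the matrix $\hat{\mathbf{H}}^{\dagger} \hat{\mathbf{H}} + N\zeta \mathbf{I}_N$ is well-conditioned as $N \to \infty$ [25]. Collecting all allocated powers in the diagonal matrix $\mathbf{P} = \operatorname{diag}(p_1, \cdots, p_M)$, we get the constraint $\operatorname{Tr}\big(\mathbf{P} \mathbf{V}^{\dagger} \mathbf{V}\big) \le P$, with $P$ the maximum transmit power of the MBS. With the aid of the results in [41, Theorem 1], the transmit power constraint is derived as
\[
\frac{1}{N} \sum_{m=1}^{M} \frac{p_m}{\Omega_m} \le P, \quad \text{and } p_m \ge 0, \; \forall m \in \mathcal{M}, \tag{90}
\]
where the parameter $\Omega_m$ is the solution to $\Omega_m = \frac{1}{N} \operatorname{Tr}\Big( \boldsymbol{\Theta}_m \Big( \frac{1}{N} \sum_{k=1}^{M} \frac{\boldsymbol{\Theta}_k}{\zeta + \Omega_k} + \mathbf{I}_N \Big)^{-1} \Big)$. By designing the precoding matrix $\mathbf{V}$ and the transmit power vector $\mathbf{p} = (p_1, \cdots, p_M)$, the ergodic DL rate of UE $m \in \mathcal{M}$ is expressed as
\[
r_m(\mathbf{p}) = \mathbb{E}\left[ \log\left( 1 + \frac{p_m |\mathbf{H}_m^{\dagger} \mathbf{V}_m|^2}{\sum_{k=1, k \neq m}^{M} p_k |\mathbf{H}_m^{\dagger} \mathbf{V}_k|^2 + \sigma_m^2} \right) \right], \tag{91}
\]
where the thermal noise of user $m$ is $\eta_m \sim \mathcal{CN}(0, \sigma_m^2)$. The ergodic DL rate in (91) involves a stochastic expectation over the CSI realizations and does not have a closed-form expression [25]. We invoke results from random matrix theory to obtain the deterministic equivalent of the ergodic DL rate [25, 41]. In particular, as $N \ge M$ and $N \gg 1$,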
for a small fixed ζ , the Ergodic DL rate almost surely converges to
rm(p)a.s.−−→ log
(
1+ pm(1− τ2m))
, ∀m ∈ M , (92)
wherea.s.−−→ denotes almost sure convergence [25], [41, Theorem 2]. Moreover, we
assume that the MBS has queue buffers to store the UE data [77]. The queue length for
UE m at time slot t is denoted by Qm(t) which evolves as follows
Qm(t + 1) = [Qm(t)− rm(t)]++ am(t), ∀m ∈ M , (93)
where $a_m(t)$ is the data arrival rate of UE $m$. Further, we assume that $a_m(t)$ is i.i.d. over time slots with a mean arrival rate of $\bar{a}_m$ and upper bounded by $a_m^{\max}$ [77].
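The queue recursion in (93) can be illustrated with a few lines of Python. This is only a minimal sketch: the function names `update_queue` and `simulate_queue` are hypothetical, and the served rates and arrivals are assumed to be given per slot in the same data unit.

```python
def update_queue(q, served, arrival):
    """One step of (93): Q(t+1) = [Q(t) - r(t)]^+ + a(t)."""
    return max(q - served, 0.0) + arrival


def simulate_queue(rates, arrivals, q0=0.0):
    """Return the trajectory [Q(0), Q(1), ...] for per-slot rates and arrivals."""
    trajectory = [q0]
    for served, arrival in zip(rates, arrivals):
        trajectory.append(update_queue(trajectory[-1], served, arrival))
    return trajectory
```

The clipping at zero reflects that a slot cannot serve more data than is currently queued, while new arrivals are always added in full.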
5.3 Problem formulation
According to Little's law [125], the average latency is proportional to $\lim_{T \to \infty}\frac{1}{T}\sum_{t=1}^{T}\mathbb{E}[Q_m(t)]/\bar{a}_m$. We use $Q_m(t)/\bar{a}_m$ as a latency measure and enforce an allowable upper bound $d_m^{\mathrm{th}}$. Note that the latency bound violation is related to reliability. Thus, taking into account the latency and reliability requirements, we characterize the latency bound violation with a tolerable probability. Specifically, we impose a probabilistic constraint on the queue length for UE $m \in \mathcal{M}$ as follows:
$$\Pr\left\{\frac{Q_m(t)}{\bar{a}_m} \ge d_m^{\mathrm{th}}\right\} \le \epsilon_m, \ \forall t. \tag{94}$$
In (94), $d_m^{\mathrm{th}}$ reflects the upper bound of the UE latency requirement. Here, $\epsilon_m \ll 1$ is the target probability for reliable communication.
To avoid the over-allocation of network resources to the UEs, i.e., $r_m(t) \gg Q_m(t)$, we incorporate a maximum rate constraint $r_m^{\max}$ for each UE $m$, i.e., $r_m^{\max} := \min\{r_m^{\max}, Q_m(t)\}$. Moreover, we require the MBS to guarantee a certain level of QoS for all UEs, i.e., the minimum rate requirement $r_m^{\min}, \forall m \in \mathcal{M}$.
We define the network utility as $\sum_{m=1}^{M}\omega_m f(\bar{r}_m)$, where $\bar{r}_m = \lim_{T \to \infty}\frac{1}{T}\sum_{t=1}^{T}\mathbb{E}[r_m(t)]$ denotes the time average expected rate and $\omega_m$ represents the non-negative weight for each UE $m$. Additionally, we assume that $f(\cdot)$ is a strictly concave, increasing, and twice continuously-differentiable function. Taking into account these constraints presented above yields the following network utility maximization
$$\begin{aligned} \textbf{OP3}: \ \max_{\mathbf{P}(t)} \quad & \sum_{m=1}^{M}\omega_m f(\bar{r}_m) && \text{(95a)} \\ \text{subject to} \quad & r_m^{\min} \le r_m(t) \le r_m^{\max}, \ \forall m \in \mathcal{M}, \ \forall t, && \text{(95b)} \\ & \text{(90) and (94).} \end{aligned}$$
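Our main problem involves a probabilistic constraint (94), which cannot be addressed tractably. To overcome this challenge, we apply Markov's inequality [127] to linearize (94) such that $\Pr\left\{\frac{Q_m(t)}{\bar{a}_m} \ge d_m^{\mathrm{th}}\right\} \le \frac{\mathbb{E}[Q_m(t)]}{\bar{a}_m d_m^{\mathrm{th}}}$. Then, (94) is satisfied if
$$\mathbb{E}[Q_m(t)] \le \bar{a}_m d_m^{\mathrm{th}}\epsilon_m, \ \forall m \in \mathcal{M}, \ \forall t. \tag{96}$$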
Thereafter, we consider (96) to represent the latency and reliability constraint. Assuming that $\{a_m(t) \,|\, \forall t \ge 1\}$ is a Poisson arrival process [127], we note that $\mathbb{E}[Q_m(t)] = t\bar{a}_m - \sum_{\tau=1}^{t} r_m(\tau)$, which is plugged into (96). Finally, we obtain
$$r_m(t) \ge t\bar{a}_m - \bar{a}_m d_m^{\mathrm{th}}\epsilon_m - \sum_{\tau=1}^{t-1} r_m(\tau), \ \forall m \in \mathcal{M}, \ \forall t, \tag{97}$$
which represents the minimum rate requirement in slot $t$ for UE $m$ for low latency communication. Here, we transform the probabilistic latency and reliability constraint (94) into one linear constraint (97) of instantaneous rate requirements, which helps to analyse and optimize the URLLC problem. Combining (95b) and (97), we rewrite OP3 as follows
$$\begin{aligned} \max_{\mathbf{P}(t)} \quad & \sum_{m=1}^{M}\omega_m f(\bar{r}_m) && \text{(98a)} \\ \text{subject to} \quad & r_m^{0}(t) \le r_m(t) \le r_m^{\max}, \ \forall m \in \mathcal{M}, \ \forall t, && \text{(98b)} \\ & \text{and (90),} \end{aligned}$$
with $r_m^{0}(t) = \max\{r_m^{\min}, \ t\bar{a}_m - \bar{a}_m d_m^{\mathrm{th}}\epsilon_m - \sum_{\tau=1}^{t-1} r_m(\tau)\}$.
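The per-slot rate floor in (98b) is a simple running computation. The following is a minimal sketch of it, assuming the latency threshold is expressed in slots and all rates share one unit; the function name `min_rate_requirement` and its argument names are hypothetical.

```python
def min_rate_requirement(t, mean_arrival, d_th, eps, r_min, past_rates):
    """Rate floor of (98b): max{r_min, t*a - a*d_th*eps - sum of rates in slots 1..t-1}."""
    latency_floor = t * mean_arrival - mean_arrival * d_th * eps - sum(past_rates)
    return max(r_min, latency_floor)
```

When the UE has recently been served well, the latency term drops below the QoS floor and `r_min` is the binding requirement; after slots with low service the latency term takes over.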
5.4 Proposed control parameter selection and power allocation
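To tackle (98), we resort to the Lyapunov framework [77]. Firstly, for each DL rate $r_m(t)$, we introduce the auxiliary variable vector $\boldsymbol{\phi}(t) = (\phi_m(t) \,|\, \forall m \in \mathcal{M})$ that satisfies
$$\bar{\phi}_m = \lim_{T \to \infty}\frac{1}{T}\sum_{t=0}^{T}\mathbb{E}\big[\phi_m(t)\big] \le \bar{r}_m, \ \forall m \in \mathcal{M}, \tag{99}$$
$$\phi_m^{0}(t) \le \phi_m(t) \le r_m^{\max}, \ \forall m \in \mathcal{M}, \ \forall t, \tag{100}$$
with $\phi_m^{0}(t) = \max\{r_m^{\min}, \ t\bar{a}_m - \bar{a}_m d_m^{\mathrm{th}}\epsilon_m - \sum_{\tau=1}^{t-1}\phi_m(\tau)\}$. Incorporating the auxiliary variables, (98) is equivalent to
$$\begin{aligned} \textbf{RP3}: \ \max_{\mathbf{P}(t), \boldsymbol{\phi}(t)} \quad & \lim_{T \to \infty}\frac{1}{T}\sum_{t=1}^{T}\sum_{m=1}^{M}\omega_m\mathbb{E}[f(\phi_m(t))] \\ \text{subject to} \quad & \text{(90), (99), and (100).} \end{aligned}$$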
In order to ensure the inequality constraint (99), a virtual queue vector $\mathbf{Y}(t) = (Y_m(t) \,|\, \forall m \in \mathcal{M})$ is introduced, where each element evolves according to
$$Y_m(t+1) = \big[Y_m(t) + \phi_m(t) - r_m(t)\big]^{+}, \ \forall m \in \mathcal{M}. \tag{101}$$
Subsequently, we express the conditional Lyapunov drift-plus-penalty for each time slot $t$ as:
$$\mathbb{E}\left[\sum_{m=1}^{M}\Big[\tfrac{1}{2}Y_m(t+1)^2 - \tfrac{1}{2}Y_m(t)^2 - \nu_m(t)\omega_m f(\phi_m(t))\Big] \,\Big|\, \mathbf{Y}(t)\right]. \tag{102}$$
In (102), $\nu_m(t)$ is the control parameter which affects the utility-queue length trade-off. This control parameter is conventionally chosen to be static and identical for all UEs [77]. However, such a static setting is ill-suited to system dynamics (e.g., instantaneous data arrivals) and to diverse system configurations (i.e., different latency and QoS requirements). Thus, we dynamically design these control parameters. From the analysis in the Lyapunov optimization framework [77], we can find $Y_m(t) \le \nu_m(t)\omega_m\pi_m + a_m^{\max}$, with $\pi_m$ being the largest first-order derivative of $f(x)$. Letting $\omega_m = 1, \forall m \in \mathcal{M}$, we have the lower bound $\pi_m\nu_m(t) \ge \nu_m^{0}(t), \forall m \in \mathcal{M}$, for selecting the control parameters, where $\nu_m^{0}(t) = \max\{Y_m(t) - a_m^{\max}, 1\}$. Subsequently, following the straightforward calculations of the Lyapunov drift-plus-penalty technique, we obtain
$$\begin{aligned} (102) \le \mathbb{E}\Big[ & \sum_{m=1}^{M}\Big(Y_m(t)\phi_m(t) - \nu_m(t)\omega_m f\big(\phi_m(t)\big)\Big) && \text{(103a)} \\ & - \sum_{m=1}^{M} Y_m(t)r_m\big(\mathbf{P}(t)\big) + C \,\Big|\, \mathbf{Y}(t)\Big]. && \text{(103b)} \end{aligned}$$
Due to space limitation, we omit the details of the constant value C which does not
influence the system performance [77]. We note that the solution to LP is acquired
by minimizing the right-hand side (RHS) of (103a) and (103b) in every slot t. Further,
(103a) is related to the reliability and QoS requirements while (103b) reflects optimal
power allocation to UEs.
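The step from (102) to (103) is not spelled out in the text. A sketch of the standard argument, under the assumption that both the auxiliary variable and the rate are non-negative and bounded by $r_m^{\max}$ as in (98b) and (100), is given below; the explicit value of $C$ is one admissible choice under that assumption, not necessarily the one used in [77].

```latex
% Since ([x]^+)^2 <= x^2, the virtual queue update (101) gives, for every m,
\begin{align*}
\tfrac{1}{2}Y_m(t+1)^2 - \tfrac{1}{2}Y_m(t)^2
  &\le \tfrac{1}{2}\big(Y_m(t) + \phi_m(t) - r_m(t)\big)^2 - \tfrac{1}{2}Y_m(t)^2 \\
  &= Y_m(t)\big(\phi_m(t) - r_m(t)\big) + \tfrac{1}{2}\big(\phi_m(t) - r_m(t)\big)^2 .
\end{align*}
% Subtracting \nu_m(t)\omega_m f(\phi_m(t)), summing over m, and taking the
% expectation conditioned on Y(t) yields (103) for any constant C such that
\[
C \ge \tfrac{1}{2}\sum_{m=1}^{M}
  \mathbb{E}\big[(\phi_m(t) - r_m(t))^2 \,\big|\, \mathbf{Y}(t)\big],
\qquad \text{e.g.} \quad
C = \tfrac{1}{2}\sum_{m=1}^{M}\big(r_m^{\max}\big)^2 .
\]
```

The first group of terms depends only on the auxiliary variables and control parameters, and the second only on the power allocation, which is why the two can be minimized separately in each slot.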
Algorithm 5.1 CCP algorithm for solving sub-problem (104) ([72] ©2017 IEEE).
1: $m \in \mathcal{M}$
2: Initialize $i = 0$ and a feasible point $\nu_m^{(i)}$ in (104b).
3: repeat
4:   Convexify $g_0(\nu_m, \nu_m^{(i)}) = g_0(\nu_m^{(i)}) + \nabla g_0(\nu_m^{(i)})\,(\nu_m - \nu_m^{(i)})$.
5:   Solve:
6:     $\min_{\phi_m, \nu_m} \ h_0(\phi_m, \nu_m) - g_0(\nu_m, \nu_m^{(i)}) + Y_m\phi_m$
7:     subject to (104b) and (104c),
8:   Find the optimal $\phi_m^{(i)\star}$ and $\nu_m^{(i)\star}$.
9:   Update $\nu_m^{(i+1)} := \nu_m^{(i)\star}$ and $i := i + 1$.
10: until convergence
5.4.1 Control parameters selection
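Considering the logarithmic fairness utility function, i.e., $f(x) = \log(x)$, minimizing the RHS of (103a) for each $m \in \mathcal{M}$ is formulated as
$$\begin{aligned} \min_{\phi_m(t), \nu_m(t)} \quad & Y_m(t)\phi_m(t) - \nu_m(t)\log\big(\phi_m(t)\big) && \text{(104a)} \\ \text{subject to} \quad & \pi_m\nu_m(t) \ge \nu_m^{0}(t), && \text{(104b)} \\ & r_m^{0}(t) \le \phi_m(t) \le r_m^{\max}. && \text{(104c)} \end{aligned}$$
Before proceeding with (104), we rewrite $-\nu_m(t)\log(\phi_m(t))$ in (104a), for any $\phi_m(t) > 0$ and $\nu_m(t) > 0$, as
$$\underbrace{\nu_m(t)\log\left(\frac{\nu_m(t)}{\phi_m(t)}\right)}_{h_0(\phi_m, \nu_m)} - \underbrace{\nu_m(t)\log\big(\nu_m(t)\big)}_{g_0(\nu_m)},$$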
in which both $h_0(\phi_m, \nu_m)$ (i.e., the relative entropy function) and $g_0(\nu_m)$ (i.e., the negative entropy function) are convex functions. Since (104a) is a difference of convex functions while constraints (104b) and (104c) are affine, problem (104) falls under DC programming [141], which can be efficiently and iteratively addressed by the CCP [142]. The CCP algorithm to obtain the solution to problem (104) is detailed in Algorithm 5.1, which provably converges to a local optimum of the DC program [142] (please refer to [142] for the formal proof).
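To make the structure of (104) concrete, the sketch below covers only a simplified special case and is not the CCP of Algorithm 5.1: it assumes the control parameter is held fixed (for instance at the lower bound allowed by (104b)), in which case the objective (104a) is convex in the auxiliary variable and its minimizer over the box (104c) has a closed form. The function names are hypothetical.

```python
def control_lower_bound(y, a_max, pi=1.0):
    """Smallest nu allowed by (104b): pi * nu >= max(Y - a_max, 1)."""
    return max(y - a_max, 1.0) / pi


def best_auxiliary_rate(y, nu, r0, r_max):
    """Minimizer of Y*phi - nu*log(phi) over [r0, r_max] for a fixed nu > 0."""
    if y <= 0.0:
        # With an empty virtual queue the objective decreases in phi.
        return r_max
    # Stationary point nu / Y, projected onto the feasible interval.
    return min(max(nu / y, r0), r_max)
```

A large virtual queue pushes the auxiliary rate towards its lower limit, while a larger control parameter pulls it upwards; this is the utility-queue length trade-off described above.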
5.4.2 Power allocation
The optimal transmit power in (103b) is computed by
$$\begin{aligned} \min_{\mathbf{P}(t)} \quad & -\sum_{m=1}^{M} Y_m(t)r_m(\mathbf{P}(t)) \\ \text{subject to} \quad & \text{(90).} \end{aligned}$$
Here, the objective function is strictly convex for $p_m(t) \ge 0, \forall m \in \mathcal{M}$, and the constraints are compact. Therefore, the optimal solution $\mathbf{P}^{\star}(t)$ exists.
After obtaining the optimal auxiliary variable and transmit power, we update the queues $Q_m(t+1)$ and $Y_m(t+1)$ as per (93) and (101), respectively.
5.5 Numerical results
We consider a single-cell massive MIMO system in which the MBS, with $N = 32$ antennas and $P = 38$ dBm, is located at the center of a 0.5 × 0.5 km² square area. UEs (from 8 to 60 UEs per km²) are randomly deployed within the MBS's coverage with a minimum MBS-UE distance of 35 m. Data arrivals follow a Poisson distribution with different means, and the rate requirements are specified as $r_m^{\max} = 1.2\bar{a}_m$, $r_m^{\min} = 0.8\bar{a}_m$, $\forall m \in \mathcal{M}$. The system bandwidth is 1 GHz. The path loss is modeled as a distance-based path loss with high probability from the line-of-sight (LOS) model for urban environments at 28 GHz [49]. $d^{\mathrm{th}}$ and $\epsilon$ are set to 10 ms and 5%, respectively. The numerical results are obtained via Monte-Carlo simulations over 10000 channel realizations. Furthermore, we compare our proposed scheme with the following baselines:
– Baseline 1 refers to the Lyapunov framework in which the probabilistic latency con-
straint (94) is considered.
– Baseline 2 is a variant of Baseline 1 without the probabilistic latency constraint (94).
5.5.1 Impact of the arrival rate
In Fig. 20, we report the average latency versus the mean arrival rate $\bar{a} = \mathbb{E}[a(t)]$ for $M = 16$. At a low $\bar{a}$, no scheme violates the latency constraint, and our proposed algorithm outperforms the baselines by a small gap. At a higher $\bar{a}$, the average latency of baseline 2 increases dramatically once $\bar{a} > 1.8$ Gbps, since baseline 2 does not incorporate the latency constraint, whereas our proposed scheme reduces latency by 28.41% and 77.11% compared to baselines 1 and 2, respectively, when $\bar{a} = 2.4$ Gbps.
[Figure 20: average latency [ms] versus mean arrival rate [Gbps] for the proposed algorithm, Baseline 1, and Baseline 2; $d^{\mathrm{th}} = 10$ ms, $\epsilon = 5\%$.]
Fig. 20. Average latency versus mean arrival rates, M = 16 per km² ([72] ©2017 IEEE).
When $\bar{a} > 2.4$ Gbps, the average latency of all schemes increases, violating the latency requirement of 10 ms. It can be observed that under a limited maximum transmit power, with a very high traffic demand, the latency requirement cannot be guaranteed. This highlights the tradeoff between the mean arrival rate and latency. In Fig. 21, we report the tail distribution (complementary cumulative distribution function (CCDF)) of the latency to showcase how often the system experiences a latency greater than the target latency levels. In particular, at $\bar{a} = 2.4$ Gbps, by imposing the probabilistic latency constraint (94), our proposed approach and baseline 1 ensure reliable communication with better guaranteed probabilities, i.e., Pr(latency > 7.5 ms) < $10^{-4}$ and Pr(latency > 9.4 ms) < $10^{-4}$, respectively. In contrast, baseline 2 violates the latency constraint with a high probability, where Pr(latency > 10 ms) = 74.75%.
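The tail probabilities quoted above are values of an empirical CCDF. As a minimal sketch, assuming the latency samples from the simulation are available as a plain list (the function name `empirical_ccdf` is hypothetical), one point of such a curve can be computed as follows.

```python
def empirical_ccdf(samples, threshold):
    """Fraction of samples strictly greater than threshold, i.e. Pr(X > threshold)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return sum(1 for x in samples if x > threshold) / len(samples)
```

Evaluating this function over a grid of thresholds gives the full tail curve. Resolving probabilities as small as $10^{-4}$ requires substantially more than $10^{4}$ samples.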
[Figure 21: CCDF versus latency [ms] for the proposed algorithm, Baseline 1, and Baseline 2 at mean arrival rates of 2, 2.4, and 2.6 Gbps.]
Fig. 21. Tail distribution (CCDF) of latency ([72] ©2017 IEEE).
5.5.2 Impact of user density
In Fig. 22, we compare the average user throughput (avgUT) and average latency of our proposed approach with the two baselines under the impact of user density, when $\bar{a} = 2$ Gbps. Additionally, we consider the weighted sum rate maximization (WSRM) case. The WSRM case is used to find the system throughput limit but suffers from higher latency. Since all users share the same resources, the average latency (solid lines) increases with the number of users $M$, whereas the avgUT (dashed lines) decreases. Fig. 22 further shows that when $M > 24$, the latency of all schemes increases dramatically and is far above the latency requirement. Hence, only a limited number of users can be served while guaranteeing the latency requirement; beyond this number, a tradeoff between latency and network density exists. Our proposed approach achieves better throughput and a higher latency reduction than baselines 1 and 2, while the WSRM case has the worst latency performance, as expected. Moreover, our proposed approach reaches Gbps capacity, which represents the capacity improvement brought by the combination of mmWave and massive MIMO techniques. Compared with WSRM, our proposed approach maintains at least 87% of the throughput limit, while achieving up to 80% latency reduction. Numerical results show that our approach simultaneously provides order-of-magnitude capacity improvements and latency reduction.
[Figure 22: average user throughput [Gbps] (left axis) and average latency [ms] (right axis) versus number of nodes per km² for WSRM, the proposed algorithm, Baseline 1, and Baseline 2.]
Fig. 22. Average latency and avgUT versus number of users per km² ([72] ©2017 IEEE).
5.6 Summary and discussion
This chapter investigated the problem of mmWave-enabled massive MIMO networks
from a latency standpoint. Specifically, the problem was modeled as a NUM problem
subject to the probabilistic latency constraint and QoS/rate requirement. Numerical
results show that, at a mean arrival rate of 2.4 Gbps, the proposed approach reduces the latency by 28.41% and 77.11% compared to baselines 1 and 2, respectively.
The proposed solution can be straightforwardly extended to a multi-cell scenario in
which the problem of inter-cell interference can be addressed by designing a hierarchi-
cal precoder to mitigate both intra-cell and inter-cell interference, or by applying an
interference coordination approach. In addition, to achieve lower latency communication, multi-connectivity and antenna diversity should be investigated in future work.
This chapter addressed the problem of low latency communication. In the next chapter,
the last question Q4 will be answered to provide more reliable communication in ultra-
dense SC networks in the presence of risk and uncertainty.
6 Ultra-reliable communication in 5G mmWave networks
This chapter addresses another key concern in 5G wireless networks, which is reliability.
Specifically, research question Q4 is answered, which enables ultra-reliable communi-
cation in ultra-dense SC networks in the presence of risk and uncertainty. Note that for
the sake of simplification, this chapter does not consider the macro cell, but the pro-
posed approach can be applied directly to other studied scenarios in previous chapters.
6.1 Main contributions and related work
A unique peculiarity of mmWave bands is that mmWave links are very sensitive to
blockage, which gives rise to unstable connectivity and unreliable communication. To
overcome this challenge, the author leverages principles of risk-sensitive reinforcement
learning (RSL) and exploits multiple antenna diversity and higher bandwidth to opti-
mize the transmission to achieve gigabit data rates, while considering the sensitivity of
mmWave links to provide ultra-reliable communication (URC). The prime motivation
behind using RSL stems from the fact that the risk-sensitive utility function to be optimized is a function of not only the average but also the variance [143], and thus it captures the tail of the rate distribution to enable URC.
Related work
In [5] the authors provided principles of wireless communication to support URLLC
such as the use of antenna diversity, network base station densification, and flexible
frame/network designs. The authors in [6] briefly defined the latency and reliability concepts, and further described some techniques to support URLLC with respect to risk, tail, and scale. In particular, risk involves decision making under uncertainty in the presence of highly fluctuating channel and network dynamics; the tail is related to the tail behaviour of random traffic arrivals or rate distributions under poor channel states; and scale concerns the case in which large numbers of devices requiring URLLC are deployed, which poses resource allocation and network design challenges [6]. Recently, the problem of low latency communication [144] and URLLC [72] for 5G mmWave networks was studied to evaluate the performance under the impact of traffic dispersion and network
densification. All these works focus on maximizing the time average of the network
throughput or minimizing the mean latency without providing any guarantees for higher
order moments (e.g., variance, skewness, kurtosis, etc.). This chapter departs from the
classical average-based system design and instead takes account of higher order mo-
ments in the utility function to formulate an RSL framework in which every small cell
optimizes its transmission while taking into account signal fluctuations.
6.2 System model
Let us consider the mmWave downlink (DL) transmission of a small cell network consisting of a set $\mathcal{B}$ of $B$ small cells (SCs), and a set $\mathcal{K}$ of $K$ user equipments (UEs) equipped with $N_k$ antennas. We assume that each SC is equipped with a large number of $N_b$ antennas to exploit massive MIMO gain and adopts a hybrid beamforming architecture [120], and we assume that $N_b \gg N_k \ge 1$. Without loss of generality, one UE per SC is considered [footnote 12]. The data traffic is generated from the SC to the UE via mmWave communication. A co-channel time-division duplexing protocol is considered, in which the DL channel can be obtained via the uplink training phase.
Each SC adopts the hybrid beamforming architecture, which enjoys both analog and digital beamforming techniques [120]. Let $g_{bk}^{(\mathrm{tx})}$ and $g_{bk}^{(\mathrm{rx})}$ denote the analog transmitter and receiver beamforming gains at the SC $b$ and UE $k$, respectively. In addition, we use $\omega_{bk}^{(\mathrm{tx})}$ and $\omega_{bk}^{(\mathrm{rx})}$ to represent the angles deviating from the strongest path between the SC $b$ and UE $k$. Also, let $\theta_{bk}^{(\mathrm{tx})}$ and $\theta_{bk}^{(\mathrm{rx})}$ denote the beamwidth at the SC and UE, respectively. We denote $\boldsymbol{\theta}$ as a vector of the transmitter beamwidth for all SCs. We adopt the widely used antenna radiation pattern model [120] to determine the analog beamforming gain as
$$g_{bk}(\omega_{bk}, \theta_{bk}) = \begin{cases} \dfrac{2\pi - (2\pi - \theta_{bk})\Gamma}{\theta_{bk}}, & \text{if } |\omega_{bk}| \le \dfrac{\theta_{bk}}{2}, \\ \Gamma, & \text{otherwise,} \end{cases} \tag{106}$$
where $0 < \Gamma \ll 1$ is the side lobe gain.
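The sectored pattern in (106) translates directly into code. The following is a minimal sketch with angles in radians; the function name `analog_beam_gain` and its argument names are hypothetical.

```python
import math


def analog_beam_gain(omega, theta, side_lobe):
    """Beamforming gain of (106) for angle offset omega and beamwidth theta (radians)."""
    if abs(omega) <= theta / 2:
        # Main lobe: whatever is not radiated through the side lobe, spread over theta.
        return (2 * math.pi - (2 * math.pi - theta) * side_lobe) / theta
    return side_lobe
```

A useful sanity check on the model is that the gain integrated over the full circle equals $2\pi$, so narrowing the beam raises the main-lobe gain without changing the total radiated power.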
Let $\mathbf{H}_{bk} \in \mathbb{C}^{N_b \times N_k}$ denote the channel propagation matrix (channel state) from SC $b$ to UE $k$. We assume a time-varying channel state described by a Markov chain with $T \in \mathbb{Z}^{+}$ states, i.e., $\mathbf{H}_{bk}(t)$, $t = 1, \ldots, T$. Considering the imperfect channel state information (CSI), the estimated channel state between the SC $b$ and UE $k$ is modelled as [72]
$$\hat{\mathbf{H}}_{bk} = \sqrt{N_b \times N_k}\,\boldsymbol{\Theta}_{bk}^{1/2}\Big(\sqrt{1 - \tau_k^2}\,\mathbf{W}_{bk} + \tau_k\hat{\mathbf{W}}_{bk}\Big),$$
where $\boldsymbol{\Theta}_{bk} \in \mathbb{C}^{N_b \times N_b}$ is the spatial channel correlation matrix with a low rank that accounts for the mmWave channel path loss and shadow fading [25, 136]. Moreover, the spatial channel model is clustered, which belongs to a finite set with a finite size [25]. Here, $\mathbf{W}_{bk} \in \mathbb{C}^{N_b \times N_k}$ is the small-scale fading channel matrix, modelled as a random matrix with a zero mean and a variance of $\frac{1}{N_b \times N_k}$. The parameter $\tau_k \in [0, 1]$ reflects the estimation accuracy for UE $k$; if $\tau_k = 0$, we have perfect channel state information. $\hat{\mathbf{W}}_{bk} \in \mathbb{C}^{N_b \times N_k}$ is the estimation noise, also modeled as a random matrix with a zero mean and a variance of $\frac{1}{N_b \times N_k}$. We denote $\mathbf{H} = \{\mathbf{H}_{bk} \,|\, \forall b \in \mathcal{B}, \forall k \in \mathcal{K}\}$ as the network state.
Footnote 12: For the multiple UE case, additional channel estimation and user scheduling need to be considered. One example was studied in [23].
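By applying a linear precoding scheme $\mathbf{V}_{bk}(\mathbf{H}_{bk})$ [120], i.e., $\mathbf{V}_{bk}(\mathbf{H}_{bk}) = \mathbf{H}_{bk}$ for the conjugate precoding, the achievable rate [footnote 13] of UE $k$ from SC $b$ can be calculated as
$$r_b(t) = w\log\left(1 + \frac{p_b\,g_{bk}^{(\mathrm{tx})}g_{bk}^{(\mathrm{rx})}|\mathbf{H}_{bk}^{\dagger}\mathbf{V}_{bk}|^2}{\sum_{b' \neq b} p_{b'}\,g_{b'k}^{(\mathrm{tx})}g_{b'k}^{(\mathrm{rx})}|\mathbf{H}_{b'k}^{\dagger}\mathbf{V}_{b'k}|^2 + \sigma_{bk}^2}\right),$$
where $p_b$ and $p_{b'}$ are the transmit powers of SC $b$ and SC $b'$, respectively. In addition, $w$ denotes the system bandwidth of the mmWave frequency band. The thermal noise of user $k$ served by SC $b$ is $\eta_{bk} \sim \mathcal{CN}(0, \sigma_{bk}^2)$. Here, we denote $P_b^{\max}$ as the maximum transmit power of SC $b$ and $\mathbf{p} = (p_b \,|\, \forall b \in \mathcal{B}, \ 0 \le p_b \le P_b^{\max})$ as the transmit power vector.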
6.3 Problem formulation
We model a decentralized optimization problem and harness tools from RSL to solve it, whereby the SCs autonomously respond to the network states based on the historical data. Let us consider a joint optimization of transmitter beamwidth [footnote 14] $\boldsymbol{\theta}$ and transmit power allocation $\mathbf{p}$. We denote $\mathbf{z}(t) = (\boldsymbol{\theta}(t), \mathbf{p}(t))$, which takes values in $\mathcal{Z} = \{\mathbf{z}_1, \cdots, \mathbf{z}_B\}$, where $\mathbf{z}_b = (\theta_b, p_b)$. Assume that each SC $b$ selects its beamwidth and transmit power drawn from a given probability distribution $\boldsymbol{\pi}_b = (\pi_b^{1}, \cdots, \pi_b^{m}, \cdots, \pi_b^{Z_b})$, in which $Z_b$ is the cardinality of the set of all combinations $(\theta_b, p_b)$ and $\sum_{m=1}^{Z_b}\pi_b^{m} = 1$. For each $m = 1, \cdots, Z_b$ and $\mathbf{z}_b^{m} = (\theta_b^{m}, p_b^{m})$, the mixed-strategy probability is defined as
$$\pi_b^{m}(t) = \Pr\big(\mathbf{z}_b(t) = \mathbf{z}_b^{m} \,\big|\, \mathbf{z}_b(0 : t-1), \boldsymbol{\pi}_b(0 : t-1)\big). \tag{107}$$
Footnote 13: Note that we omit the beam search/track time, since it can be done in a short time compared to transmission time [119]. We assume that each BS sends a single stream to its users via the main beams.
Footnote 14: As studied in [120], for $\eta \le \frac{1}{3}$, the problem of selecting the beamwidth for the transmitter and receiver can be done by adjusting the transmitter beamwidth with a fixed receiver beamwidth.
We denote $\boldsymbol{\pi} = \{\boldsymbol{\pi}_1, \cdots, \boldsymbol{\pi}_b, \cdots, \boldsymbol{\pi}_B\} \in \Pi$, in which $\Pi$ is the set of all possible probability mass functions (PMF). Let $\mathbf{r} = (\mathbf{r}_1, \cdots, \mathbf{r}_B)$ denote the instantaneous rates, in which $\mathbf{r}_b = (r_b(0), \cdots, r_b(T))$. Let $\mathcal{R}$ denote the rate region, which is defined as the convex hull of the rates [131], i.e., $\mathbf{r} \in \mathcal{R}$. Inspired by the RSL [143], we consider the following utility function, given by
$$u_b = \frac{1}{\mu_b}\log\mathbb{E}_{\mathbf{H}, \boldsymbol{\pi}}\left[\exp\Big(\mu_b\sum_{t=0}^{T} r_b(t)\Big)\right], \tag{108}$$
where the parameter $\mu_b < 0$ denotes the desired risk-sensitivity, which will penalize the variability [143], and the operator $\mathbb{E}$ denotes the expectation operation.
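Remark 6.1. The Taylor expansion of the utility function given in (108) yields
$$u_b = \mathbb{E}_{\mathbf{H}, \boldsymbol{\pi}}\left[\sum_{t=0}^{T} r_b(t)\right] + \frac{\mu_b}{2}\mathrm{Var}_{\mathbf{H}, \boldsymbol{\pi}}\left[\sum_{t=0}^{T} r_b(t)\right] + \mathcal{O}(\mu_b^2).$$
Remark 6.1 shows that the utility function (108) considers both the mean and variance (Var) terms of the mmWave links. We formulate the following distributed optimization problem for every SC as
$$\begin{aligned} \textbf{OP4}: \ \max_{\boldsymbol{\pi}_b} \quad & \frac{1}{\mu_b}\log\mathbb{E}_{\mathbf{H}, \boldsymbol{\pi}_b}\left[\exp\Big(\mu_b\sum_{t=0}^{T} r_b(t)\Big)\right] && \text{(109a)} \\ \text{subject to} \quad & \mathbf{r}_b \in \mathcal{R}, \ \boldsymbol{\pi}_b \in \Pi, \ p_b \le P_b^{\max}. && \text{(109b)} \end{aligned}$$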
It is challenging to solve (109) if each SC is not able to fully observe the network state. This work does not assume explicit knowledge of the state transition probabilities. Here, we leverage the principles of RL to optimize the transmit beam in a totally decentralized manner [143, 80, 94].
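The relation between the risk-sensitive utility (108) and its mean-variance expansion in Remark 6.1 can be checked numerically. The sketch below assumes the expectation is replaced by an average over equally likely samples of the summed rate; the function names are hypothetical, and a log-sum-exp shift is used purely for numerical stability.

```python
import math
import statistics


def risk_sensitive_utility(total_rates, mu):
    """(1/mu) * log(mean(exp(mu * x))) over equally likely samples x, cf. (108)."""
    if mu == 0:
        raise ValueError("mu must be non-zero")
    scaled = [mu * x for x in total_rates]
    peak = max(scaled)  # shift by the largest exponent to avoid overflow/underflow
    mean_exp = sum(math.exp(s - peak) for s in scaled) / len(scaled)
    return (peak + math.log(mean_exp)) / mu


def mean_variance_approximation(total_rates, mu):
    """First two terms of the expansion in Remark 6.1: mean + (mu / 2) * variance."""
    return statistics.fmean(total_rates) + 0.5 * mu * statistics.pvariance(total_rates)
```

For a negative risk-sensitivity parameter the utility never exceeds the sample mean, and it moves further below the mean as the samples become more spread out, which is how rate variability is penalized.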
6.4 Proposed distributed learning algorithm
In Fig. 23, each SC acts as an agent which selects an action to maximize a long-term reward based on user feedback and a probability distribution for each action. The action is defined as the selection of $\mathbf{z}_b$, while the long-term utility in (109) is the reward, and the environment here contains the network state.
[Figure 23: agent-environment loop (action, observation, reward, new state, feedback) and the time indices of each episode, which consists of an uplink training phase, a downlink transmission phase, and an uplink transmission and feedback phase; the episode representation for simulation shows example episodes in LOS and NLOS states.]
Fig. 23. A reinforcement learning model ([73] ©2018 IEEE).
To this end, we build the probability distribution for every action and provide an RL procedure to solve (109). We denote $u_b^{m} = u_b^{m}(\mathbf{z}_b^{m}, \mathbf{z}_{-b})$ as a utility function of SC $b$ when selecting $\mathbf{z}_b^{m}$. Here, $\mathbf{z}_{-b}$ denotes the composite variable of other agents' actions excluding SC $b$. From (108), the utility $u_b(t)$ of SC $b$ at time slot $t$, i.e., $u_b = \sum_{t=0}^{T} u_b(t)$, is rewritten as
$$u_b(t) = \frac{1}{\mu_b}\log\left(\sum_{m=1}^{Z_b}\pi_b^{m}\exp\Big(\mu_b r_b^{m}\big(\mathbf{z}_b^{m}(t), \mathbf{z}_{-b}\big)\Big)\right), \tag{110}$$
where $r_b^{m}(\mathbf{z}_b^{m}(t), \mathbf{z}_{-b})$ is the instantaneous rate of SC $b$ when choosing $\mathbf{z}_b^{m}(t) = (\theta_b^{m}(t), p_b^{m}(t))$ with a probability of $\pi_b^{m}(t)$.
Remark 6.2. For a small $\mu_b$, (108) is approximated via the Taylor approximation [footnote 15] of $r_b$ around $\mu_b \to 0$ as
\begin{align}
u_b &= \frac{1}{\mu_b}\mathbb{E}\left[\sum_{t=0}^{T}\big(\exp(\mu_b r_b(t)) - 1\big)\right], \tag{111} \\
&= \frac{1}{T+1}\sum_{t=0}^{T}\frac{\exp(\mu_b r_b(t)) - 1}{\mu_b}, \tag{112}
\end{align}
where (112) is obtained by expanding the time average of (111). Each SC determines $(\theta_b^{m}, p_b^{m})$ from $\mathcal{Z}_b$ based on the probability distribution from the previous stage $t - 1$, i.e.,
$$\boldsymbol{\pi}_b(t-1) = \big(\pi_b^{1}(t-1), \cdots, \pi_b^{Z_b}(t-1)\big). \tag{113}$$
Footnote 15: For $x$ close to 1, the Taylor approximation of $\log(x)$ is $x - 1$.
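We introduce the Boltzmann-Gibbs (BG) distribution to capture the exploitation and exploration, $\boldsymbol{\beta}_b(\mathbf{u}_b(t))$, given by
$$\boldsymbol{\beta}_b(\mathbf{u}_b(t)) = \operatorname*{argmax}_{\boldsymbol{\pi}_b \in \Pi}\sum_{m \in \mathcal{Z}_b}\Big[\pi_b^{m}u_b^{m}(t) - \kappa_b\pi_b^{m}\ln(\pi_b^{m})\Big], \tag{114}$$
where $\mathbf{u}_b(t) = \big(u_b^{1}(t), \cdots, u_b^{Z_b}(t)\big)$ is the utility vector of SC $b$ for $\mathbf{z}_b \in \mathcal{Z}_b$, and the trade-off factor $\kappa_b$ is used to maintain the balance between exploration and exploitation. If $\kappa_b$ is small, the SC selects $\mathbf{z}_b$ with the highest payoff. For $\kappa_b \to \infty$, all decisions have equal chance.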
For a given $\mathbf{u}_b(t)$ and $\kappa_b$, we solve (114) to find the probability distribution, and by adopting the notion of logit equilibrium [80, 94], we have
$$\beta_b^{m}(\mathbf{u}_b(t)) = \frac{\exp\Big(\frac{1}{\kappa_b}\big[u_b^{m}\big]^{+}\Big)}{\sum_{m' \in \mathcal{Z}_b}\exp\Big(\frac{1}{\kappa_b}\big[u_b^{m'}\big]^{+}\Big)}, \tag{115}$$
where $[x]^{+} \equiv \max[x, 0]$. Finally, we propose two coupled RL processes that run in parallel and allow SCs to decide their optimal strategies at each time instant $t$ as follows [80, 94].
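Risk-sensitive learning procedure: We denote $\hat{\mathbf{u}}_b(t)$ as the estimated utility of SC $b$, in which the estimated utility and probability mass function are updated for each action $m \in \mathcal{Z}_b$ as follows:
$$\begin{aligned} \hat{u}_b^{m}(t) &= \hat{u}_b^{m}(t-1) + \iota_b^{(1)}(t)\,\mathbb{1}_{\{\mathbf{z}_b(t) = \mathbf{z}_b^{m}\}} \times \big(u_b(t-1) - \hat{u}_b^{m}(t-1)\big), \\ \pi_b^{m}(t) &= \pi_b^{m}(t-1) + \iota_b^{(2)}(t)\big(\beta_b^{m}(\hat{\mathbf{u}}_b(t)) - \pi_b^{m}(t-1)\big), \end{aligned}$$
where $\iota_b^{(1)}(t)$ and $\iota_b^{(2)}(t)$ are the learning rates which satisfy the following conditions (please see [80, 94] for the convergence proof):
$$\lim_{T \to \infty}\sum_{t=0}^{T}\iota_b^{(1)}(t) = +\infty, \quad \lim_{T \to \infty}\sum_{t=0}^{T}\iota_b^{(2)}(t) = +\infty,$$
$$\lim_{T \to \infty}\sum_{t=0}^{T}\iota_b^{(1)}(t)^2 < +\infty, \quad \lim_{T \to \infty}\sum_{t=0}^{T}\iota_b^{(2)}(t)^2 < +\infty,$$
$$\lim_{t \to \infty}\frac{\iota_b^{(2)}(t)}{\iota_b^{(1)}(t)} = 0.$$
Finally, each SC determines $\mathbf{z}_b^{m}$ as per (113).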
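The two coupled updates above, together with the logit distribution (115), can be sketched for a single SC as follows. This is an illustrative sketch only: the function names are hypothetical, the observed utility is assumed to be supplied by the caller from user feedback, and the learning-rate exponents 0.55 and 0.6 are the values used in the numerical results of this chapter.

```python
import math


def boltzmann_gibbs(utilities, kappa):
    """Logit distribution (115) over actions, using [u]^+ = max(u, 0)."""
    scaled = [max(u, 0.0) / kappa for u in utilities]
    peak = max(scaled)  # shift exponents for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def learning_step(est_utils, probs, chosen, observed_utility, t, kappa):
    """One update of the utility estimates and the mixed strategy of one SC."""
    rate_u = 1.0 / (t + 1) ** 0.55  # faster time scale: utility estimation
    rate_p = 1.0 / (t + 1) ** 0.6   # slower time scale: strategy update
    new_utils = list(est_utils)
    # Only the action actually played has its estimate refreshed.
    new_utils[chosen] += rate_u * (observed_utility - est_utils[chosen])
    target = boltzmann_gibbs(new_utils, kappa)
    new_probs = [p + rate_p * (b - p) for p, b in zip(probs, target)]
    return new_utils, new_probs
```

Because each new probability is a convex combination of the old probability and the BG target, the strategy remains a valid distribution after every step.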
6.5 Numerical results
Dense SCs are randomly deployed in a 0.5 × 0.5 km² area and we assume one UE per SC with a fixed user association. We assume that each SC adjusts its beamwidth with a step of 0.05 radian from the range $[\theta^{\min}, \theta^{\max}]$, where $\theta^{\min} = 0.2$ radian and $\theta^{\max} = 0.4$ radian denote the minimum and maximum beamwidths of each SC, respectively. The transmit power level set of each SC is {21, 23, 25} dBm and the SC antenna gain is 5 dBi. The number of transmit antennas $N_b$ and receive antennas $N_k$ at the SC and UE are set to 64 and 4, respectively. The blockage is modeled as a distance-dependent probability state where the channel is either line-of-sight (LOS) or non-LOS for urban environments at 28 GHz, and the system bandwidth is 1 GHz [136]. Numerical results are obtained via Monte-Carlo simulations over 50 different random topologies. The risk-sensitive parameter is set to $\mu_b = -2$. For the learning algorithm, the trade-off factor $\kappa_b$ is set to 5, while the learning rates $\iota_b^{(1)}(t)$ and $\iota_b^{(2)}(t)$ are set to $\frac{1}{(t+1)^{0.55}}$ and $\frac{1}{(t+1)^{0.6}}$, respectively [94]. Furthermore, we compare our proposed RSL scheme with the following baselines:
– Classical Learning (CSL) refers to the RL framework in which the utility function
only considers the mean value of mmWave links [94].
– Baseline 1 (BL1) refers to [120] optimizing the beamwidth with maximum transmit
power.
In Fig. 25, we plot the complementary cumulative distribution function (CCDF, i.e., the tail distribution) of the user throughput (UT) at 28 GHz when the number of SCs is 24 per km². The CCDF curves reflect the reliable probability (in both linear and logarithmic scales), defined as the probability that the UT is higher than a target rate $r_0$ Gbps, i.e., $\Pr(\mathrm{UT} \ge r_0)$. We also study the impact of imperfect CSI with $\tau_k = 0.3$ and of noisy feedback from UEs. We observe that the performance of our proposed RSL framework is reduced under these impairments. We next compare our proposed RSL method with other baselines with perfect CSI and user feedback. It is observed that the RSL scheme achieves better reliability, $\Pr(\mathrm{UT} \ge 10\ \text{Gbps})$, of more than 85%, whereas the baselines
CSL and BL1 obtain less than 75% and 65%, respectively. However, at very low rates (less than 2 Gbps) or very high rates (10.65 − 11 Gbps), captured by the cross-point, the RSL obtains a lower probability as compared to the baselines. In other words, our proposed solution provides a UT which is more concentrated around its median in order to provide uniformly good service for all users. For instance, the UT distribution of our proposed algorithm has a small variance of 0.4846, while the CSL has a higher variance of 2.6893.
[Figure 24: reliability versus number of SCs per km² for risk-sensitive learning, classical learning, and Baseline 1 at target rates $r_0$ = 2, 3, and 4 Gbps.]
Fig. 24. Reliability versus network density ([73] ©2018 IEEE).
Fig. 25. Tail distribution of the achievable rate, B = 24 ([73] ©2018 IEEE).
Fig. 24 reports the impact of the network density on the reliability, which is defined as the fraction of UEs who achieve a given target rate $r_0$, i.e., $\frac{K_{r > r_0}}{K}$. Here, the number of SCs varies from 16 to 128 per km². For given target rates of 2, 3, and 4 Gbps, our proposed algorithm guarantees higher reliability compared to the baselines. Moreover, the higher the target rate, the bigger the performance gap between our proposed algorithm and the baselines. A linear increase in the network density reduces reliability; for example, when the density increases from 16 to 96, the fraction of users that achieve 4 Gbps under RSL, CSL, and BL1 is reduced by 11.61%, 16.72%, and 39.11%, respectively. This highlights a key trade-off between reliability and network density.
[Figure 26: availability [Gbps] versus number of SCs per km² for risk-sensitive learning, classical learning, and Baseline 1 at 80% and 90% availability.]
Fig. 26. Availability versus network density ([73] ©2018 IEEE).
In Fig. 26, we show the impact of the network density on the availability, which defines what rate is obtained for a target probability. We plot the 80% and 90% probabilities with which the system achieves a rate of at least $r$ Gbps. For a given target probability of 90%, our proposed algorithm guarantees more than 9 Gbps of UT, whereas the baselines guarantee less than 7.5 Gbps of UT for $B = 16$. If we lower the target probability to 80%, the achievable rate is increased by 5%. This gives rise to a tradeoff between the reliability and the data rate. In addition, for a given probability, the achievable rate $r$ is reduced with an increase in network density. For instance, when the network density increases from 16 to 80, the achievable rate is reduced by 50%. This highlights the tradeoff between availability and network density.
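The two metrics used in this section, reliability and availability, are both simple statistics of the per-user rates. A minimal sketch, assuming the rates are available as a plain non-empty list of samples and using hypothetical function names, is:

```python
import math


def reliability(rates, target_rate):
    """Fraction of UEs whose rate exceeds target_rate."""
    return sum(1 for r in rates if r > target_rate) / len(rates)


def availability(rates, target_probability):
    """Largest observed rate that at least target_probability of the samples reach."""
    ordered = sorted(rates, reverse=True)
    # Number of samples that must be at or above the returned rate.
    needed = math.ceil(target_probability * len(ordered) - 1e-9)
    needed = min(max(needed, 1), len(ordered))
    return ordered[needed - 1]
```

Reliability fixes the rate and reports a probability, whereas availability fixes the probability and reports a rate, so a more demanding target on one axis lowers the achievable value on the other.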
We numerically observe that $T = 4000$ is long enough for the agents to learn and reach the optimal solution. We assume that the channel condition changes after every $T = 4000$ iterations. Our proposed algorithm converges faster than the classical learning baseline, as shown in Fig. 27. By harnessing the risk-averse notion, the agents attempt to find the best strategy subject to the variations of the mmWave rates.
[Figure 27: achievable rate [Gbps] versus iterations for risk-sensitive learning and classical learning; the proposed algorithm converges faster.]
Fig. 27. Convergence of the proposed RSL and classical RL ([73] ©2018 IEEE).
6.6 Summary and discussion
This chapter studied the problem of providing multi-gigabit wireless access with reli-
able communication by optimizing the transmit beam and considering the link sensi-
tivity in 5G mmWave networks. A distributed risk-sensitive RL based approach was
proposed taking into account both mean and variance values of the mmWave links. Nu-
merical results show that our proposed approach provides better services for all users.
For instance, the proposed approach achieves a Pr(UT≥ 10Gbps) which is higher than
85%, whereas the baselines obtain less than 75% and 65% with 24 small cells.
As studied in Chapter 4, the proposed reinforcement learning algorithm works only in
static and sparse networks. In a high mobility environment, a fast convergent solution is
required. Together with the problem of beam selection and power allocation, the beam
tracking and alignment become more challenging in high mobility mmWave networks.
7 Conclusions and future work
This chapter concludes the thesis and provides several future directions in view of up-
coming 5G wireless systems and beyond.
7.1 Conclusions
The focus of this thesis is to propose an integrated access-backhaul architecture for the
deployment of 5G wireless networks and beyond. As networks become denser in terms
of the numbers of users and base stations, it is highly challenging to implement network
planning and optimization. To address this challenge, joint resource allocation and interference
mitigation schemes were proposed in Chapters 3-6 to answer the aforementioned fundamental
questions under different network architectures. By leveraging three key enabling
technologies, namely mmWave communication, massive MIMO, and dense small cells,
this thesis finds solutions to provide high data rates, low latency, and high reliability.
The research results also provide deployment guidelines for 5G wireless networks and
beyond, summarized as follows:
Chapter 3 answered the following questions: for a given target UE throughput, what is the
optimal number of UEs to be scheduled, and what is the optimal/maximum number of SCs to
be deployed? The studied problem was decoupled into the dynamic scheduling of UEs, the
backhaul provisioning of in-band FD-enabled SCs, and the offloading of UEs to in-band
FD-enabled SCs as a function of interference, the number of antennas, and backhaul loads.
In addition, the results show that at higher frequency bands FD-enabled SCs work better
in an open access mode than in a closed access mode under the same transmit power budget.
In particular, with increasing SC density, open access FD-enabled SCs achieve 5.6× gains
in terms of cell-edge performance compared to the closed access ones in ultra-dense
networks with 350 small cell base stations per km².
Chapter 4 provided solutions for the problem of multi-hop multi-path transmissions
in mmWave networks. In particular, the solutions provide guidelines for selecting
the best paths among the possible paths and for assigning transmission rates over these
paths, while satisfying a probabilistic latency constraint and maintaining network stability.
Reinforcement learning was employed to exploit historical information and select the best
paths based on their empirical distributions. A probabilistic latency constraint was
incorporated into the rate allocation problem, so that an upper bound on latency could be
guaranteed except with a small violation probability.
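As a hedged illustration of what a probabilistic latency constraint of this kind can look like, consider the following sketch. The symbols are assumptions introduced here and do not restate the exact formulation of Chapter 4: $Q(t)$ is a non-negative queue length, $\bar{\lambda}$ the mean arrival rate, $\beta$ the latency bound, and $\epsilon$ the tolerable violation probability. Relating queue length to delay through Little's law [125] and applying Markov's inequality gives a sufficient condition on the mean queue length:

```latex
\Pr\!\left( \frac{Q(t)}{\bar{\lambda}} \ge \beta \right) \le \epsilon
\quad\Longleftarrow\quad
\mathbb{E}\!\left[ Q(t) \right] \le \bar{\lambda}\,\beta\,\epsilon ,
\qquad \text{because} \qquad
\Pr\!\left( Q(t) \ge \bar{\lambda}\beta \right)
\le \frac{\mathbb{E}\!\left[ Q(t) \right]}{\bar{\lambda}\beta} .
```

The right-hand condition is conservative, since Markov's inequality is generally loose, but it replaces a probability constraint with a bound on an average that is easier to handle in an optimization problem.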
In 5G networks and beyond, an important concern is how to support ultra-low latency
and highly reliable communications. Chapter 5 discussed the latency issue in
mmWave-enabled massive MIMO systems, in which a latency bound violation was
characterized with a tolerable probability. The research results demonstrated that, for
a limited maximum transmit power and very high traffic demands, the latency requirement
could not be guaranteed. This highlights the trade-off between the mean arrival
rate and latency. In addition, only a limited number of users can be served while
guaranteeing the delay requirement; beyond that number, a trade-off between latency and
network density arises.
In Chapter 6, a novel solution was proposed to provide Gbps data transmission with
reliable communication in mmWave environments, where the channels fluctuate strongly
and the links are sensitive to blockages. The approach departs from the classical
average-based system design and instead takes higher-order moments into account
in the utility function to formulate a risk-sensitive reinforcement learning framework,
through which every small cell exploits the diversity of multiple antennas and the higher
bandwidth to optimize its transmission while taking signal fluctuations into account.
In particular, the proposed solution provided a UT that is more concentrated around
its median, supporting a uniformly high level of service for all users. For instance, the UT
distribution of our proposed algorithm has a small variance of 0.4846, while that of the CSL
has a higher variance of 2.6893. The results established important trade-offs between
network density and reliability/availability.
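To make the role of higher-order moments concrete, the sketch below uses the entropic utility (1/μ) log E[exp(μ r)], a standard risk-sensitive objective whose second-order expansion is mean + (μ/2) × variance, so that μ < 0 penalizes variance. This specific form, the parameter name `mu`, and the sample rates are assumptions made for illustration; Chapter 6 gives the formulation actually used.

```python
import math
from typing import Sequence


def entropic_utility(rewards: Sequence[float], mu: float) -> float:
    """Empirical (1/mu) * log(mean(exp(mu * r))); mu < 0 is risk-averse."""
    if not rewards:
        raise ValueError("at least one reward sample is required")
    n = len(rewards)
    if mu == 0.0:
        # Risk-neutral limit: the plain sample mean.
        return sum(rewards) / n
    # Log-sum-exp shift for numerical stability.
    shift = max(mu * r for r in rewards)
    log_mean = shift + math.log(sum(math.exp(mu * r - shift) for r in rewards) / n)
    return log_mean / mu


# Two hypothetical links with the same mean rate (5 Gbps) but different spread.
steady = [5.0, 5.0, 5.0, 5.0]
volatile = [1.0, 9.0, 1.0, 9.0]

print(entropic_utility(steady, -0.5))    # equals the mean: no fluctuation to penalize
print(entropic_utility(volatile, -0.5))  # below the mean: fluctuation is penalized
```

An average-based design would rank the two links equally, whereas the risk-averse utility prefers the steady one, which mirrors the preference for a user throughput concentrated around its median.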
In summary, an integrated access-backhaul architecture was proposed for the de-
ployment of future networks. In particular, as networks grow denser in terms of users
and small cell base stations, the proposed IAB architecture simultaneously schedules
the users and provides a wireless backhaul for the dense deployment of small cells.
In this regard, joint resource allocation and interference mitigation solutions were
proposed for two-hop and multi-hop self-backhauled millimeter-wave (mmWave) networks.
Further, the thesis provides solutions to support low-latency and reliable communications,
establishing key trade-offs such as that between network density and
latency/reliability.
7.2 Future work
First, some idealized assumptions made in this thesis should be pointed out. In particular,
self-interference cancellation (SIC), backhaul synchronization, channel reciprocity, and
traffic aggregation were assumed to be perfect. One of the most important extensions is to
investigate the impact of imperfect SIC on the performance of IAB systems and to
develop algorithms that approach near-perfect SIC performance. Imperfect backhaul
synchronization and traffic aggregation cause additional latency. The above assumptions
therefore yield upper bounds on the performance achievable in practice, and future work
should take these non-idealities into account to bridge the performance gap between theory
and practice. Moreover, regarding beamforming techniques for mmWave communications,
Chapters 3 and 5 studied a simple system model in which digital beamforming was employed,
while analog beamforming was not properly addressed. In this regard, future work could
employ hybrid beamforming techniques to improve the beamforming gain and to reduce the
power consumption and hardware cost with a limited number of radio-frequency chains.
Finally, some possible research directions are listed as follows:
– There is a need to evaluate the impact of imperfect SIC on the IAB system for both
TDD and FDD protocols.
– In mmWave communications in high-mobility environments, the coherence time is
much shorter, and thus ultra-fast and efficient beam tracking and alignment
are of high importance.
– The proposed reinforcement learning algorithms allow individual network elements
to operate independently in a distributed manner. However, the main drawback of
RL is its slow convergence when the state-action spaces are large. In particular,
dynamic networks with high mobility that demand high reliability and low latency
require near-optimal solutions within a reasonable time. In this regard, deep
reinforcement learning is a promising approach to obtain faster convergence and handle a
large number of state-action pairs.
– Large-scale network optimization remains extremely challenging, even though the
interference can be less severe in higher frequency bands. In particular, with a large
number of users, a dense deployment of base stations, high mobility, and varying traffic
demands and QoS requirements, the problem of universal load balancing and interference
management (ULBIM) becomes extremely complex, as the many coupled variables are not easy
to decouple. Inevitably, 5G and beyond networks will need new, powerful tools to solve
the ULBIM problem. In this regard, artificial intelligence and machine learning are
currently being investigated for wireless networks.
References
[1] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. Soong, and J. C. Zhang,
“What will 5G be?” IEEE Journal on Selected Areas in Communications, vol. 32, no. 6,
pp. 1065–1082, June 2014.
[2] F. Boccardi, R. W. Heath, A. Lozano, T. L. Marzetta, and P. Popovski, “Five disruptive
technology directions for 5G,” IEEE Communications Magazine, vol. 52, no. 2, pp. 74–80,
Feb. 2014.
[3] P. Popovski, “Ultra-reliable communication in 5G wireless systems,” Proceedings - Inter-
national Conference 5G for Ubiquitous Connectivity (5GU), pp. 146–151, 2014.
[4] G. Durisi, T. Koch, and P. Popovski, “Toward massive, ultrareliable, and low-latency wire-
less communication with short packets,” Proceedings of the IEEE, vol. 104, no. 9, pp.
1711–1726, 2016.
[5] P. Popovski, J. J. Nielsen, C. Stefanovic, E. de Carvalho, E. Strom, K. F. Trillingsgaard,
A.-S. Bana, D. M. Kim, R. Kotaba, J. Park et al., “Wireless access for ultra-reliable
low-latency communication: Principles and building blocks,” IEEE Network, vol. 32, no. 2,
pp. 16–23, 2018.
[6] M. Bennis, M. Debbah, and H. V. Poor, “Ultra-reliable and low-latency wireless commu-
nication: Tail, risk and scale,” Proceedings of the IEEE, vol. 106, no. 10, pp. 1834–1853,
2018.
[7] C. Bockelmann, N. Pratas, H. Nikopour, K. Au, T. Svensson, C. Stefanovic, P. Popovski,
and A. Dekorsy, “Massive machine-type communications in 5G: Physical and MAC-layer
solutions,” IEEE Communications Magazine, vol. 54, no. 9, pp. 59–65, 2016.
[8] Z. Dawy, W. Saad, A. Ghosh, J. G. Andrews, and E. Yaacoub, “Toward massive machine
type cellular communications,” IEEE Wireless Communications, vol. 24, no. 1, pp. 120–
128, 2017.
[9] K. Miyauchi, “Millimeter-Wave Communication,” Infrared and millimeter waves, vol. 9,
1983.
[10] T. S. Rappaport, S. Sun, R. Mayzus, H. Zhao, Y. Azar, K. Wang, G. N. Wong, J. K. Schulz,
M. Samimi, and F. Gutierrez Jr, “Millimeter wave mobile communication for 5G cellular:
It will work!” IEEE Access, vol. 1, pp. 335–349, 2013.
[11] Y. Niu, Y. Li, D. Jin, L. Su, and A. V. Vasilakos, “A survey of millimeter wave communi-
cation (mmWave) for 5G: opportunities and challenges,” Wireless Networks, vol. 21, no. 8,
pp. 2657–2676, 2015.
[12] F. Gómez-Cuba, E. Erkip, S. Rangan, and F. J. González-Castaño, “Capacity scaling of
cellular networks: Impact of bandwidth, infrastructure density and number of antennas,”
IEEE Transactions on Wireless Communications, vol. 17, no. 1, pp. 652–666, 2018.
[13] M. Xiao, S. Mumtaz, Y. Huang, L. Dai, Y. Li, M. Matthaiou, G. K. Karagiannidis, E. Björn-
son, K. Yang, I. Chih-Lin et al., “Millimeter wave communications for future mobile net-
works,” IEEE Journal on Selected Areas in Communications, vol. 35, no. 9, pp. 1909–
1935, 2017.
[14] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station
antennas,” IEEE Transactions on Wireless Communications, vol. 9, no. 11, pp. 3590–3600,
2010.
[15] J. Hoydis, K. Hosseini, S. Brink, and M. Debbah, “Making smart use of excess antennas:
Massive MIMO, Small Cells, and TDD,” Bell Labs Technical Journal, vol. 18, no. 2, pp.
5–21, 2013.
[16] F. Rusek, D. Persson, B. Lau, E. Larsson, T. Marzetta, O. Edfors, and F. Tufvesson, “Scal-
ing up MIMO: Opportunities and challenges with very large arrays,” IEEE Signal Process-
ing Magazine, vol. 30, no. 1, pp. 40–60, 2013.
[17] V. Chandrasekhar, J. G. Andrews, and A. Gatherer, “Femtocell networks: a survey,” IEEE
Communications Magazine, vol. 46, no. 9, 2008.
[18] J. G. Andrews, “Seven ways that hetnets are a cellular paradigm shift,” IEEE Communica-
tions Magazine, vol. 51, no. 3, pp. 136–144, 2013.
[19] A. Anpalagan, M. Bennis, and R. Vannithamby, Design and deployment of small cell net-
works. Cambridge University Press, 2015.
[20] M. Bennis, M. Simsek, A. Czylwik, W. Saad, S. Valentin, and M. Debbah, “When cellular
meets WiFi in wireless small cell networks,” IEEE Communications Magazine, vol. 51,
no. 6, pp. 44–50, 2013.
[21] K. Hosseini, J. Hoydis, S. Brink, and M. Debbah, “Massive MIMO and Small Cells:
How to densify heterogeneous networks,” Proceedings - IEEE International Conference
on Communications (ICC), pp. 5442–5447, 2013.
[22] N. Bhushan, J. Li, D. Malladi, R. Gilmore, D. Brenner, A. Damnjanovic, R. Sukhavasi,
C. Patel, and S. Geirhofer, “Network densification: The dominant theme for wireless evo-
lution into 5G,” IEEE Communications Magazine, vol. 52, no. 2, pp. 82–89, 2014.
[23] T. K. Vu, M. Bennis, S. Samarakoon, M. Debbah, and M. Latva-aho, “Joint load balancing
and interference mitigation in 5G heterogeneous networks,” IEEE Transactions on Wire-
less Communications, vol. 16, no. 9, pp. 6032–6046, 2017.
[24] T. K. Vu, M. Bennis, M. Debbah, and M. Latva-aho, “Joint path selection and rate al-
location framework for 5G self-backhauled mmWave networks,” IEEE Transactions on
Wireless Communications, vol. 18, no. 4, pp. xxxx–xxxx, 2019.
[25] A. Liu and V. Lau, “Hierarchical interference mitigation for massive MIMO cellular net-
works,” IEEE Transactions on Signal Processing, vol. 62, no. 18, pp. 4786–4797, 2014.
[26] D. López-Pérez, A. Valcarce, G. De La Roche, and J. Zhang, “OFDMA femtocells: A
roadmap on interference avoidance,” IEEE Communications Magazine, vol. 47, no. 9,
2009.
[27] N. Saquib, E. Hossain, L. B. Le, and D. I. Kim, “Interference management in OFDMA fem-
tocell networks: Issues and approaches,” IEEE Wireless Communications, vol. 19, no. 3,
2012.
[28] C. H. de Lima, M. Bennis, and M. Latva-aho, “Coordination mechanisms for self-
organizing femtocells in two-tier coexistence scenarios,” IEEE Transactions on Wireless
Communications, vol. 11, no. 6, pp. 2212–2223, 2012.
[29] T. K. Vu, K. Sungoh, and O. Sangchul, “Cooperative interference mitigation algorithm
in heterogeneous networks,” IEICE Transactions on Communications, vol. 98, no. 11, pp.
2238–2247, 2015.
[30] E. Bastug, M. Bennis, M. Kountouris, and M. Debbah, “Cache-enabled small cell net-
works: Modeling and tradeoffs,” EURASIP Journal on Wireless Communications and Net-
working, vol. 2015, no. 1, p. 41, 2015.
[31] D. Bharadia, E. McMilin, and S. Katti, “Full duplex radios,” in ACM SIGCOMM Computer
Communication Review, vol. 43, no. 4. ACM, 2013, pp. 375–386.
[32] A. Sabharwal, P. Schniter, D. Guo, D. W. Bliss, S. Rangarajan, and R. Wichman, “In-band
full-duplex wireless: Challenges and opportunities,” IEEE Journal on selected areas in
communications, vol. 32, no. 9, pp. 1637–1652, 2014.
[33] L. Song, R. Wichman, Y. Li, and Z. Han, Full-duplex communications and networks.
Cambridge University Press, 2017.
[34] G. R. Kenworthy, “Self-cancelling full-duplex RF communication system,” 1997, US
Patent 5,691,978.
[35] S. Hong, J. Brand, J. I. Choi, M. Jain, J. Mehlman, S. Katti, and P. Levis, “Applications
of self-interference cancellation in 5G and beyond,” IEEE Communications Magazine,
vol. 52, no. 2, pp. 114–121, 2014.
[36] G. Liu, F. R. Yu, H. Ji, V. C. Leung, and X. Li, “In-band full-duplex relaying: A survey,
research issues and challenges,” Resource, vol. 147, p. 172, 2015.
[37] Z. Zhang, X. Chai, K. Long, A. V. Vasilakos, and L. Hanzo, “Full duplex techniques for
5G networks: self-interference cancellation, protocol design, and relay selection,” IEEE
Communications Magazine, vol. 53, no. 5, pp. 128–137, 2015.
[38] M. S. Elbamby, M. Bennis, W. Saad, M. Debbah, and M. Latva-Aho, “Resource optimiza-
tion and power allocation in in-band full duplex-enabled non-orthogonal multiple access
networks,” IEEE Journal on Selected Areas in Communications, vol. 35, no. 12, pp. 2860–
2873, 2017.
[39] J. Du, E. Onaran, D. Chizhik, S. Venkatesan, and R. A. Valenzuela, “Gbps user rates using
mmWave relayed backhaul with high-gain antennas,” IEEE Journal on Selected Areas in
Communications, vol. 35, no. 6, pp. 1363–1372, 2017.
[40] E. Castaneda, A. Silva, A. Gameiro, and M. Kountouris, “An overview on resource alloca-
tion techniques for multi-user MIMO systems,” IEEE Communications Surveys & Tutori-
als, vol. 19, no. 1, pp. 239–284, 2017.
[41] S. Wagner, R. Couillet, M. Debbah, and D. Slock, “Large system analysis of linear precod-
ing in correlated MISO broadcast channels under limited feedback,” IEEE Transactions
on Information Theory, vol. 58, no. 7, pp. 4509–4537, 2012.
[42] E. G. Larsson, O. Edfors, F. Tufvesson, and T. L. Marzetta, “Massive MIMO for next
generation wireless systems,” IEEE Communications Magazine, vol. 52, no. 2, pp. 186–
195, 2014.
[43] L. Sanguinetti, A. Moustakas, and M. Debbah, “Interference management in 5G reverse
TDD HetNets: A large system analysis,” IEEE Journal on Selected Areas in Communica-
tions, vol. 33, pp. 1187–1200, 2015.
[44] J. Flordelis, F. Rusek, F. Tufvesson, E. G. Larsson, and O. Edfors, “Massive MIMO perfor-
mance TDD versus FDD: What do measurements say?” IEEE Transactions on Wireless
Communications, vol. 17, no. 4, pp. 2247–2261, 2018.
[45] N. Akbar, N. Yang, P. Sadeghi, and R. A. Kennedy, “Multi-cell multiuser massive MIMO
networks: User capacity analysis and pilot design,” IEEE Transactions on Communica-
tions, vol. 64, no. 12, pp. 5064–5077, 2016.
[46] X. Zhu, Z. Wang, L. Dai, and C. Qian, “Smart pilot assignment for massive MIMO,” IEEE
Communications Letters, vol. 19, no. 9, pp. 1644–1647, 2015.
[47] J.-C. Shen, J. Zhang, and K. B. Letaief, “Downlink user capacity of massive MIMO under
pilot contamination,” IEEE Transactions on Wireless Communications, vol. 14, no. 6, pp.
3183–3193, 2015.
[48] E. Björnson, E. G. Larsson, and M. Debbah, “Massive MIMO for maximal spectral effi-
ciency: How many users and pilots should be allocated?” IEEE Transactions on Wireless
Communications, vol. 15, no. 2, pp. 1293–1308, 2016.
[49] M. R. Akdeniz, Y. Liu, M. K. Samimi, S. Sun, S. Rangan, T. S. Rappaport, and E. Erkip,
“Millimeter wave channel modeling and cellular capacity evaluation,” IEEE J. Sel. Areas
Commun., vol. 32, no. 6, pp. 1164–1179, Jun. 2014.
[50] A. L. Swindlehurst et al., “Millimeter-wave massive MIMO: The next wireless revolu-
tion?” IEEE Communications Magazine, vol. 52, no. 9, pp. 56–62, Sep. 2014.
[51] 3GPP, “Technical Specification Group Radio Access Network; Study on Integrated Access
and Backhaul for NR,” 3rd Generation Partnership Project (3GPP), Technical Report
(TR) 38.874, 2018, rel-15.
[52] C. Dehos, J. L. González, A. De Domenico, D. Ktenas, and L. Dussopt, “Millimeter-wave
access and backhauling: the solution to the exponential data traffic increase in 5G mobile
communications systems?” IEEE Communications Magazine, vol. 52, no. 9, pp. 88–95,
2014.
[53] N. Omidvar, A. Liu, V. Lau, F. Zhang, and M. R. Pakravan, “Optimal hierarchical radio
resource management for hetnets with flexible backhaul,” IEEE Transactions on Wireless
Communications, vol. 17, no. 7, pp. 4239–4255, 2018.
[54] O. Tipmongkolsilp, S. Zaghloul, and A. Jukan, “The evolution of cellular backhaul tech-
nologies: Current issues and future trends,” IEEE Commun. Surveys & Tutorials, vol. 13,
no. 1, pp. 97–113, 2011.
[55] A. De La Oliva, X. C. Pérez, A. Azcorra, A. Di Giglio, F. Cavaliere, D. Tiegelbekkers,
J. Lessmann, T. Haustein, A. Mourad, and P. Iovanna, “Xhaul: toward an integrated
fronthaul/backhaul architecture in 5G networks,” IEEE Wireless Communications, vol. 22,
no. 5, pp. 32–40, 2015.
[56] M. Jaber, M. A. Imran, R. Tafazolli, and A. Tukmanov, “5G backhaul challenges and
emerging research directions: A survey,” IEEE Access, vol. 4, pp. 1743–1766, 2016.
[57] D. Tse and P. Viswanath, Fundamentals of wireless communication. Cambridge
University Press, 2005.
[58] E. Dahlman, G. Mildh, S. Parkvall, J. Peisa, J. Sachs, Y. Selén, and J. Sköld, “5G wireless
access: requirements and realization,” IEEE Communications Magazine, vol. 52, no. 12,
pp. 42–47, 2014.
[59] P. Rost, A. Banchs, I. Berberana, M. Breitbach, M. Doll, H. Droste, C. Mannweiler, M. A.
Puente, K. Samdanis, and B. Sayadi, “Mobile network architecture evolution toward 5G,”
IEEE Communications Magazine, vol. 54, no. 5, pp. 84–91, 2016.
[60] I. Chih-Lin, S. Han, Z. Xu, S. Wang, Q. Sun, and Y. Chen, “New paradigm of 5G wireless
internet,” IEEE Journal on Selected Areas in Communications, vol. 34, no. 3, pp. 474–482,
2016.
[61] C. Perfecto, J. Del Ser, and M. Bennis, “Millimeter-wave V2V communications: Dis-
tributed association and beam alignment,” IEEE Journal on Selected Areas in Communi-
cations, vol. 35, no. 9, pp. 2148–2162, 2017.
[62] S. Samarakoon, M. Bennis, W. Saad, M. Debbah, and M. Latva-Aho, “Ultra dense small
cell networks: Turning density into energy efficiency,” IEEE Journal on Selected Areas in
Communications, vol. 34, no. 5, pp. 1267–1280, 2016.
[63] K. Son, S. Chong, and G. De Veciana, “Dynamic association for load balancing and in-
terference avoidance in multi-cell networks,” IEEE Transactions on Wireless Communica-
tions, vol. 8, no. 7, 2009.
[64] H. Kim, G. De Veciana, X. Yang, and M. Venkatachalam, “Distributed α-optimal user
association and cell load balancing in wireless networks,” IEEE/ACM Transactions on
Networking, vol. 20, no. 1, pp. 177–190, 2012.
[65] Q. Ye, B. Rong, Y. Chen, M. Al-Shalash, C. Caramanis, and J. G. Andrews, “User as-
sociation for load balancing in heterogeneous cellular networks,” IEEE Transactions on
Wireless Communications, vol. 12, no. 6, pp. 2706–2716, 2013.
[66] D. Bethanabhotla, O. Y. Bursalioglu, H. C. Papadopoulos, and G. Caire, “Optimal user-
cell association for massive MIMO wireless networks,” IEEE Transactions on Wireless
Communications, vol. 15, no. 3, pp. 1835–1850, 2016.
[67] J. Andrews, S. Singh, Q. Ye, X. Lin, and H. Dhillon, “An overview of load balancing in
HetNets: Old myths and open problems,” IEEE Wireless Communications, vol. 21, no. 2,
pp. 18–25, 2014.
[68] D. Liu, L. Wang, Y. Chen, M. Elkashlan, K. K. Wong, R. Schober, and L. Hanzo, “User
association in 5G networks: A survey and an outlook,” IEEE Communications Surveys &
Tutorials, vol. 18, no. 2, pp. 1018–1044, 2016.
[69] S. Hur et al., “Millimeter wave beamforming for wireless backhaul and access in small
cell networks,” IEEE Transactions on Communications, vol. 61, no. 10, pp. 4391–4403,
Oct. 2013.
[70] F. Sohrabi and W. Yu, “Hybrid digital and analog beamforming design for large-scale
antenna arrays,” IEEE Journal of Selected Topics in Signal Processing, vol. 10, no. 3, pp.
501–513, 2016.
[71] L. Zhao, D. W. K. Ng, and J. Yuan, “Multi-user precoding and channel estimation for
hybrid millimeter wave systems,” IEEE Journal on Selected Areas in Communications,
vol. 35, no. 7, pp. 1576–1590, 2017.
[72] T. K. Vu, C.-F. Liu, M. Bennis, M. Debbah, M. Latva-aho, and C. S. Hong, “Ultra-reliable
and low latency communication in mmWave-enabled massive MIMO networks,” IEEE
Communications Letters, vol. 21, no. 9, pp. 2041–2044, 2017.
[73] T. K. Vu, M. Bennis, M. Debbah, M. Latva-aho, and C. S. Hong, “Ultra-reliable com-
munication in 5G mmWave networks: A risk-sensitive approach,” IEEE Communications
Letters, vol. 22, no. 4, pp. 708–711, 2018.
[74] T. K. Vu, M. Bennis, S. Samarakoon, M. Debbah, and M. Latva-aho, “Joint in-band
backhauling and interference mitigation in 5G heterogeneous networks,” Proceedings - 22nd
European Wireless Conference, pp. 1–6, 2016.
[75] T. K. Vu, C.-F. Liu, M. Bennis, M. Debbah, and M. Latva-aho, “Path selection and rate
allocation in self-backhauled mmWave networks,” Proceedings - IEEE Wireless Commu-
nications and Networking Conference (WCNC), pp. 1–6, 2018.
[76] L. Georgiadis, M. J. Neely, and L. Tassiulas, “Resource allocation and cross-layer control
in wireless networks,” Foundations and Trends in Networking, vol. 1, no. 1, pp. 1–144,
2006.
[77] M. J. Neely, “Stochastic network optimization with application to communication and
queueing systems,” Synthesis Lectures on Communication Networks, vol. 3, no. 1, pp.
1–211, 2010.
[78] A. Ben-Tal and A. Nemirovski, “On polyhedral approximations of the second-order cone,”
Mathematics of Operations Research, vol. 26, no. 2, pp. 193–205, 2001.
[79] R. Couillet and M. Debbah, Random matrix methods for wireless communications. Cam-
bridge University Press, 2011.
[80] S. Lasaulce and H. Tembine, Game theory and learning for wireless networks: Fundamen-
tals and applications. Academic Press, 2011.
[81] U. Ugurlu, T. Riihonen, and R. Wichman, “Optimized in-band full-duplex MIMO relay
under single-stream transmission,” IEEE Transactions on Vehicular Technology, vol. 65,
no. 1, pp. 155–168, 2016.
[82] Z. Jun et al., “Large system analysis of cognitive radio network via partially-projected regu-
larized zero-forcing precoding,” IEEE Transactions on Wireless Communications, vol. 14,
no. 9, pp. 4934–4947, 2015.
[83] H. Boche, S. Naik, and M. Schubert, “Pareto boundary of utility sets for multiuser wireless
systems,” IEEE/ACM Transactions on Networking, vol. 19, no. 2, pp. 589–601, 2011.
[84] Z. Chen, S. Vorobyov, C. Wang, J. Thompson et al., “Pareto region characterization for
rate control in MIMO interference systems and Nash bargaining,” IEEE Transactions on
Automatic Control, vol. 57, no. 12, pp. 3203–3208, 2012.
[85] A. Beck, A. Ben-Tal, and L. Tetruashvili, “A sequential parametric convex approxima-
tion method with applications to nonconvex truss topology design problems,” Journal of
Global Optimization, vol. 47, no. 1, pp. 29–51, 2010.
[86] L. Tran, M. F. Hanif, A. Tölli, and M. Juntti, “Fast converging algorithm for weighted sum
rate maximization in multicell MISO downlink,” IEEE Signal Processing Letters, vol. 19,
no. 12, pp. 872–875, 2012.
[87] H. Li, L. Song, and M. Debbah, “Energy efficiency of large-scale multiple antenna systems
with transmit antenna selection,” IEEE Transactions on Communications, vol. 62, no. 2,
pp. 638–647, 2014.
[88] J. Löfberg, “YALMIP: A toolbox for modeling and optimization in MATLAB,” Proceed-
ings - IEEE International Symposium on Computer Aided Control Systems Design, pp.
284–289, 2004.
[89] K.-C. Toh, M. J. Todd, and R. H. Tütüncü, “SDPT3 - a MATLAB software package for
semidefinite programming, version 1.3,” Optimization Methods and Software, vol. 11, no.
1-4, pp. 545–581, 1999.
[90] MOSEK ApS, “The MOSEK optimization toolbox for MATLAB manual, Version 7.1
(Revision 28),” http://mosek.com (accessed on March 20, 2015), 2015.
[91] J. Mo and J. Walrand, “Fair end-to-end window-based congestion control,” IEEE/ACM
Transactions on Networking, vol. 8, no. 5, pp. 556–567, 2000.
[92] A. Roivainen, C. F. Dias, N. Tervo, V. Hovinen, M. Sonkki, and M. Latva-aho,
“Geometry-based stochastic channel model for two-story lobby environment at 10 GHz,” IEEE
Transactions on Antennas and Propagation, vol. 64, no. 9, pp. 3990–4003, 2016.
[93] 3GPP, “Evolved Universal Terrestrial Radio Access (E-UTRA); Radio Frequency (RF)
system scenarios,” 3rd Generation Partnership Project (3GPP), Technical Report
(TR) 36.942, 2014, rel-12.
[94] M. Bennis, S. M. Perlaza, P. Blasco, Z. Han, and H. V. Poor, “Self-organization in small
cell networks: A reinforcement learning approach,” IEEE Transactions on Wireless Com-
munications, vol. 12, no. 7, pp. 3202–3212, 2013.
[95] S. Singh, M. N. Kulkarni, A. Ghosh, and J. G. Andrews, “Tractable model for rate in
self-backhauled millimeter wave cellular networks,” IEEE Journal on Selected Areas in
Communications, vol. 33, no. 10, pp. 2196–2211, 2015.
[96] H. Shokri-Ghadikolaei and C. Fischione, “The transitional behavior of interference in mil-
limeter wave networks and its impact on medium access control,” IEEE Transactions on
Communications, vol. 64, no. 2, pp. 723–740, 2016.
[97] M. Rebato, M. Mezzavilla, S. Rangan, F. Boccardi, and M. Zorzi, “Understanding noise
and interference regimes in 5G millimeter-wave cellular networks,” Proceedings - 22nd
European Wireless Conference, pp. 1–5, 2016.
[98] V. Petrov, M. Komarov, D. Moltchanov, J. M. Jornet, and Y. Koucheryavy, “Interference
and SINR in millimeter wave and terahertz communication systems with blocking and
directional antennas,” IEEE Transactions on Wireless Communications, vol. 16, no. 3, pp.
1791–1808, 2017.
[99] A. Zhou, M. Liu, Z. Li, and E. Dutkiewicz, “Cross-layer design for proportional delay
differentiation and network utility maximization in multi-hop wireless networks,” IEEE
Transactions on Wireless Communications, vol. 11, no. 4, pp. 1446–1455, 2012.
[100] L. X. Bui, R. Srikant, and A. Stolyar, “A novel architecture for reduction of delay and
queueing structure complexity in the back-pressure algorithm,” IEEE/ACM Transactions
on Networking, vol. 19, no. 6, pp. 1597–1609, 2011.
[101] E. Stai and S. Papavassiliou, “User optimal throughput-delay trade-off in multihop
networks under NUM framework,” IEEE Communications Letters, vol. 18, no. 11, pp.
1999–2002, 2014.
[102] A. Zhou, M. Liu, Z. Li, and E. Dutkiewicz, “Joint traffic splitting, rate control, routing, and
scheduling algorithm for maximizing network utility in wireless mesh networks,” IEEE
Transactions on Vehicular Technology, vol. 65, no. 4, pp. 2688–2702, 2016.
[103] J. Garcia-Rois, F. Gomez-Cuba, M. R. Akdeniz, F. J. Gonzalez-Castano, J. C. Burguillo,
S. Rangan, and B. Lorenzo, “On the analysis of scheduling in dynamic duplex multi-
hop mmWave cellular systems,” IEEE Transactions on Wireless Communications, vol. 14,
no. 11, pp. 6028–6042, 2015.
[104] G. Narlikar, G. Wilfong, and L. Zhang, “Designing multihop wireless backhaul networks
with delay guarantees,” Wireless Networks, vol. 16, no. 1, pp. 237–254, 2010.
[105] D. Jurca and P. Frossard, “Media flow rate allocation in multipath networks,” IEEE Trans.
Multimedia, vol. 9, no. EPFL-ARTICLE-91033, pp. 1227–1240, 2007.
[106] S. Kompella, S. Mao, Y. T. Hou, and H. D. Sherali, “On path selection and rate allocation
for video in wireless mesh networks,” IEEE/ACM Transactions on Networking, vol. 17,
no. 1, pp. 212–224, 2009.
[107] E. Björnson, L. Sanguinetti, J. Hoydis, and M. Debbah, “Optimal design of energy-
efficient multi-user MIMO systems: Is massive MIMO the answer?” IEEE Transactions
on Wireless Communications, vol. 14, no. 6, pp. 3059–3075, 2015.
[108] B. Sahoo, C.-H. Yao, and H.-Y. Wei, “Millimeter-wave multi-hop wireless backhauling
for 5G cellular networks,” pp. 1–6, June 2017.
[109] G. Yang, M. Haenggi, and M. Xiao, “Traffic allocation for low-latency multi-hop
millimeter-wave networks with buffers,” IEEE Transactions on Communications, 2018.
[110] P. Key, L. Massoulié, and D. Towsley, “Path selection and multipath congestion control,”
Proceedings - 26th IEEE International Conference on Computer Communications (INFO-
COM), pp. 143–151, 2007.
[111] Z. Zhang, X. Chai, K. Long, A. V. Vasilakos, and L. Hanzo, “Full duplex techniques for
5G networks: Self-interference cancellation, protocol design, and relay selection,” IEEE
Communications Magazine, vol. 53, no. 5, pp. 128–137, 2015.
[112] A. Alkhateeb, O. El Ayach, G. Leus, and R. W. Heath, “Channel estimation and hybrid
precoding for millimeter wave cellular systems,” IEEE Journal of Selected Topics in Signal
Processing, vol. 8, no. 5, pp. 831–846, 2014.
[113] D. H. Nguyen, L. B. Le, and T. Le-Ngoc, “Hybrid MMSE precoding for mmWave mul-
tiuser MIMO systems,” Proceedings - IEEE International Conference on Communications
(ICC), pp. 1–6, 2016.
[114] A. Alkhateeb, G. Leus, and R. W. Heath, “Limited feedback hybrid precoding for multi-
user millimeter wave systems,” IEEE Transactions on Wireless Communications, vol. 14,
no. 11, pp. 6481–6494, 2015.
[115] S. Singh, M. Geraseminko, S.-P. Yeh, N. Himayat, and S. Talwar, “Proportional fair traf-
fic splitting and aggregation in heterogeneous wireless networks,” IEEE Communications
Letters, vol. 20, no. 5, pp. 1010–1013, 2016.
[116] M. Giordani, M. Mezzavilla, S. Rangan, and M. Zorzi, “Multi-connectivity in 5G
mmWave cellular networks,” Proceedings - Mediterranean Ad Hoc Network Workshop
(Med-Hoc-Net), pp. 1–7, 2016.
[117] H. Shokri-Ghadikolaei, L. Gkatzikis, and C. Fischione, “Beam-searching and transmis-
sion scheduling in millimeter wave communications,” Proceedings - IEEE International
Conference on Communications (ICC), pp. 1292–1297, 2015.
[118] M. Hussain and N. Michelusi, “Energy-efficient interactive beam-alignment for millimeter-wave networks,” IEEE Transactions on Wireless Communications, vol. 18, no. 2, pp. 838–851, Feb. 2019.
[119] J. Palacios, D. De Donno, and J. Widmer, “Tracking mm-Wave channel dynamics: Fast beam training strategies under mobility,” Proceedings - 36th Annual IEEE International Conference on Computer Communications (INFOCOM), pp. 1–9, 2017.
[120] J. Liu and E. S. Bentley, “Hybrid-beamforming-based millimeter-wave cellular network optimization,” Proceedings - 15th International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt), pp. 1–8, 2017.
[121] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially sparse precoding in millimeter wave MIMO systems,” IEEE Transactions on Wireless Communications, vol. 13, no. 3, pp. 1499–1513, 2014.
[122] J. Wildman et al., “On the joint impact of beamwidth and orientation error on throughput in directional wireless Poisson networks,” IEEE Transactions on Wireless Communications, vol. 13, no. 12, pp. 7072–7085, 2014.
[123] T. Nitsche, C. Cordeiro, A. B. Flores, E. W. Knightly, E. Perahia, and J. C. Widmer, “IEEE 802.11ad: directional 60 GHz communication for multi-Gigabit-per-second Wi-Fi,” IEEE Communications Magazine, vol. 52, no. 12, pp. 132–141, 2014.
[124] T. Baykas, C.-S. Sum, Z. Lan, J. Wang, M. A. Rahman, H. Harada, and S. Kato, “IEEE 802.15.3c: the first IEEE wireless standard for data rates over 1 Gb/s,” IEEE Communications Magazine, vol. 49, no. 7, 2011.
[125] J. D. Little and S. C. Graves, “Little’s law.” Springer, 2008, pp. 81–100.
[126] M. S. Elbamby, C. Perfecto, M. Bennis, and K. Doppler, “Toward low-latency and ultra-reliable virtual reality,” IEEE Network, vol. 32, no. 2, pp. 78–84, 2018.
[127] A. Mukherjee, “Queue-aware dynamic on/off switching of small cells in dense heterogeneous networks,” Proceedings - IEEE Global Communications Conference Workshops, pp. 182–187, Dec. 2013.
[128] S. M. Perlaza, H. Tembine, S. Lasaulce, and M. Debbah, “Quality-of-service provisioning in decentralized networks: A satisfaction equilibrium approach,” IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 2, pp. 104–116, 2012.
[129] S. Samarakoon, M. Bennis, W. Saad, and M. Latva-aho, “Backhaul-aware interference management in the uplink of wireless small cell networks,” IEEE Transactions on Wireless Communications, vol. 12, no. 11, pp. 5813–5825, 2013.
[130] S. Singh, T. Jaakkola, M. L. Littman, and C. Szepesvári, “Convergence results for single-step on-policy reinforcement-learning algorithms,” Machine Learning, vol. 38, no. 3, pp. 287–308, 2000.
[131] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge University Press, 2004.
[132] A. Ben-Tal and A. Nemirovski, Lectures on modern convex optimization: Analysis, algorithms, and engineering applications. SIAM, 2001.
[133] K.-G. Nguyen, L.-N. Tran, O. Tervo, Q.-D. Vu, and M. Juntti, “Achieving energy efficiency fairness in multicell MISO downlink,” IEEE Communications Letters, vol. 19, no. 8, pp. 1426–1429, 2015.
[134] A. Adhikary, E. Al Safadi, M. K. Samimi, R. Wang, G. Caire, T. S. Rappaport, and A. F. Molisch, “Joint spatial division and multiplexing for mm-wave channels,” IEEE Journal on Selected Areas in Communications, vol. 32, no. 6, pp. 1239–1255, 2014.
[135] T. L. Marzetta and B. M. Hochwald, “Fast transfer of channel state information in wireless systems,” IEEE Transactions on Signal Processing, vol. 54, no. 4, pp. 1268–1278, 2006.
[136] T. Bai, V. Desai, and R. W. Heath, “Millimeter wave cellular channel models for system evaluation,” Proceedings - IEEE International Conference on Computing, Networking and Communications (ICNC), pp. 178–182, 2014.
[137] M. Weiner et al., “Design of a low-latency, high-reliability wireless communication system for control applications,” Proceedings - IEEE International Conference on Communications (ICC), pp. 3829–3835, Jun. 2014.
[138] M. N. Kulkarni, E. Visotsky, and J. G. Andrews, “Correction factor for analysis of MIMO wireless networks with highly directional beamforming,” IEEE Wireless Communications Letters, vol. 7, no. 5, pp. 756–759, 2018.
[139] T. Bai, R. Vaze, and R. Heath, “Analysis of blockage effects on urban cellular networks,” IEEE Transactions on Wireless Communications, vol. 13, no. 9, pp. 5070–5083, 2014.
[140] E. Björnson, E. G. Larsson, and T. L. Marzetta, “Massive MIMO: Ten myths and one critical question,” IEEE Communications Magazine, vol. 54, no. 2, pp. 114–123, 2016.
[141] T. H. A. Le and D. T. Pham, “The DC (difference of convex functions) programming and DCA revisited with DC models of real world nonconvex optimization problems,” Annals of Operations Research, vol. 133, no. 1, pp. 23–46, 2005.
[142] T. Lipp and S. Boyd, “Variations and extension of the convex–concave procedure,” Optimization and Engineering, pp. 1–25, 2014.
[143] O. Mihatsch and R. Neuneier, “Risk-sensitive reinforcement learning,” Machine Learning, vol. 49, no. 2-3, pp. 267–290, 2002.
[144] G. Yang, M. Xiao, and H. V. Poor, “Low-latency millimeter-wave communications: Traffic dispersion or network densification?” IEEE Transactions on Communications, vol. 66, no. 8, pp. 3526–3539, 2018.
Appendix 1 Proofs in chapter 3
1.1 Convergence analysis for Algorithm 3.1
We now establish the convergence of Algorithm 3.1, which is based on the SCA method. The original problem (50) has a non-convex objective function (50a) and a non-convex constraint (50f), so the SCA method replaces it with a strongly convex problem (55). For the sake of completeness, we briefly outline the convergence argument, which was studied in [85, 86]. Assume that Algorithm 3.1 obtains the solution of problem (55) at iteration $i+1$. The updating rule in Algorithm 3.1 ensures that the optimal values $\boldsymbol{\Lambda}^{o(i)}$, $\delta_{ks}^{(i)}$, and $\rho_s^{(i)}$ at iteration $i$ satisfy all constraints in (55) and are therefore feasible for the optimization problem at iteration $i+1$. Because each iteration minimizes a convex function, the objective value obtained at iteration $i+1$ is less than or equal to that obtained at iteration $i$. In other words, Algorithm 3.1 yields a non-increasing sequence of objective values. Due to the antenna and interference constraints, the objective is bounded, and thus Algorithm 3.1 converges to a locally optimal solution of (55). Moreover, Algorithm 3.1 produces a sequence of points that are feasible for the original problem (50), and the resulting solution satisfies the KKT conditions of the original problem (50), as discussed in [85, 86].
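The monotone-descent argument can be illustrated on a toy problem. The following Python sketch is not Algorithm 3.1 and does not involve problems (50) or (55); it applies the same successive convex approximation idea to an assumed one-dimensional objective $f(x) = x^4 - 2x^2$, replacing the concave part $-2x^2$ by its tangent at the current iterate. The function names, the starting point, and the iteration count are illustrative assumptions.

```python
def objective(x: float) -> float:
    """Toy non-convex objective f(x) = x**4 - 2*x**2 (an assumed example)."""
    return x**4 - 2.0 * x**2


def sca_step(x_i: float) -> float:
    """Minimize the convex surrogate built at x_i.

    The concave part -2*x**2 is replaced by its tangent at x_i, giving the
    surrogate x**4 - 4*x_i*x + 2*x_i**2, which upper-bounds f and touches it
    at x_i. Setting the derivative 4*x**3 - 4*x_i to zero gives x = x_i**(1/3).
    """
    return x_i ** (1.0 / 3.0)  # closed form, valid for x_i > 0


def run_sca(x0: float = 0.5, iterations: int = 30):
    """Return the iterates and objective values of the toy SCA loop."""
    xs = [x0]
    for _ in range(iterations):
        xs.append(sca_step(xs[-1]))
    return xs, [objective(x) for x in xs]


xs, values = run_sca()
for i in (0, 1, 2, 5, 30):
    print(f"iter {i:2d}: x = {xs[i]:.6f}, f(x) = {values[i]:.6f}")
```

Because each surrogate upper-bounds the objective and is tight at the current iterate, the objective values cannot increase from one iteration to the next, and in this toy case the iterates settle at the stationary point $x = 1$.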
1.2 Performance analysis
Theorem 1.1 characterizes the performance of the network utility maximization under the Lyapunov framework; its proof also shows that the queues are stable.
Theorem 1.1. [Optimality] Assume that all queues are initially empty. For arbitrary arrival rates, the operation mode and load balancing are chosen to satisfy (49) and the rate regime. For a given constant $\chi \geq 0$, the network utility maximization with any $\nu > 0$ provides the following utility performance with $\chi$-approximation:
$$ f_0 \geq f_0^* - \frac{\Psi + \chi}{\nu}, $$
where $f_0^*$ is the optimal network utility over the rate regime.
Proof: To prove Theorem 1.1, we first prove that the queues are bounded. Let $\pi_k$ denote the largest right derivative of $f(r_k)$. The Lyapunov framework guarantees the following strong stability of the virtual queues and the network queues:
$$ Q_k(t) \leq \nu \omega_k(t) \pi_k + 2 a_k^{\max}, \tag{116} $$
$$ Y_k(t) \leq \nu \omega_k(t) \pi_k + a_k^{\max}, \tag{117} $$
$$ D_s(t) \leq \nu \omega_s(t) \pi_s + a_{s+M}^{\max}. \tag{118} $$
We first prove the bounds on the virtual queues; the bound on the network queues is then proved similarly. The argument is by induction on $t$. Since all queues are initially empty, the bounds clearly hold for $t = 0$. Suppose that these inequalities hold for some $t > 0$; we need to show that they also hold for $t + 1$.
From (45) and (47), if $Y_k(t) \leq \nu\omega_k(t)\pi_k$ and $D_s(t) \leq \nu\omega_s(t)\pi_s$, then $Y_k(t+1) \leq \nu\omega_k(t)\pi_k + a_k^{\max}$ and $D_s(t+1) \leq \nu\omega_s(t)\pi_s + a_{s+M}^{\max}$, so the bounds hold for $t+1$ owing to the arrival rate constraints $r_k(t) \leq a_k^{\max}$ and $r_s(t) \leq a_s^{\max}$. Otherwise, if $Y_k(t) \geq \nu\omega_k(t)\pi_k$ and $D_s(t) \geq \nu\omega_s(t)\pi_s$, the auxiliary variables, which are determined by minimizing $\sum_{k=1}^{K} Y_k(t)\phi_k(t) + \sum_{s=1}^{S} D_s(t)\phi_{s+M}(t) - \nu f_0(\boldsymbol{\phi}(t))$, are forced to zero, i.e., $\boldsymbol{\phi}(t) = \mathbf{0}$. From (47) and (45), $Y_k(t+1)$ and $D_s(t+1)$ are then bounded by $Y_k(t)$ and $D_s(t)$, respectively.
Since the virtual queues are bounded at time $t$, we have the following inequalities:
$$ Y_k(t+1) \leq Y_k(t) \leq \nu\omega_k(t)\pi_k + a_k^{\max}, \tag{119} $$
$$ D_s(t+1) \leq D_s(t) \leq \nu\omega_s(t)\pi_s + a_{s+M}^{\max}. \tag{120} $$
Hence, the bounds on the virtual queues hold for all $t$. Similarly, we show that the network queue bound (116) holds for all $t$. It clearly holds for $t = 0$. Assuming that (116) holds for some $t > 0$, we now prove that it holds for $t+1$. From (44) and (47), we have $Q_k(t+1) \leq Y_k(t+1) + a_k(t)$. Moreover, we have just proved that $Y_k(t+1) \leq \nu\omega_k(t)\pi_k + a_k^{\max}$; since $a_k(t) \leq a_k^{\max}$, it follows that $Q_k(t+1) \leq \nu\omega_k(t)\pi_k + 2a_k^{\max}$, and the network queue bound holds for $t+1$.
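The induction can be checked numerically on a single virtual queue. The Python sketch below is only an illustration under stated assumptions: the update $Y(t+1) = \max(Y(t) - r(t), 0) + \phi(t)$ stands in for (47), which is not reproduced here; the utility is taken to be $\omega \log(1+\phi)$, so that $\pi = 1$; the weight $\omega$ is constant; and the service process $r(t)$ is uniformly random. None of these choices are taken from the thesis.

```python
import random


def choose_aux(y: float, nu: float, omega: float, a_max: float) -> float:
    """Auxiliary variable minimizing y*phi - nu*omega*log(1 + phi) on [0, a_max]."""
    if y <= 0.0:
        return a_max  # the expression is strictly decreasing in phi when y = 0
    unconstrained = nu * omega / y - 1.0  # stationary point of the convex expression
    return min(max(unconstrained, 0.0), a_max)


def simulate(nu: float = 20.0, omega: float = 1.0, a_max: float = 3.0,
             slots: int = 10_000, seed: int = 0) -> float:
    """Run the assumed virtual-queue recursion and return the largest backlog seen."""
    rng = random.Random(seed)
    y, y_peak = 0.0, 0.0
    for _ in range(slots):
        phi = choose_aux(y, nu, omega, a_max)
        r = rng.uniform(0.0, a_max)   # assumed service process with r(t) <= a_max
        y = max(y - r, 0.0) + phi     # assumed stand-in for the update (47)
        y_peak = max(y_peak, y)
    return y_peak


nu, omega, a_max, pi = 20.0, 1.0, 3.0, 1.0  # pi = f'(0) for f = log(1 + phi)
peak = simulate(nu, omega, a_max)
bound = nu * omega * pi + a_max             # right-hand side of (117)
print(f"peak backlog {peak:.3f} <= bound {bound:.3f}: {peak <= bound}")
```

The two branches of `choose_aux` mirror the two cases of the proof: below the threshold $\nu\omega\pi$ the queue can grow by at most $a^{\max}$ per slot, and at or above the threshold the auxiliary variable is zero, so the queue cannot grow.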
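Having established the queue bounds, we now show the utility bound. Since our solution of (46) minimizes the Lyapunov drift minus the weighted objective function in every time slot $t$, we have the following inequality:
$$
\begin{aligned}
\Delta(\boldsymbol{\Xi}(t)) - \nu \mathbb{E}[f_0(\boldsymbol{\phi}(t))] \leq{} & \Psi - \nu \mathbb{E}[f_0(\boldsymbol{\phi}^*(t))] + \sum_{k=1}^{K} Q_k(t)\, \mathbb{E}\big[a_k(t) - r_k^*(t) \mid \boldsymbol{\Xi}(t)\big] \\
& + \sum_{k=1}^{K} Y_k(t)\, \mathbb{E}\big[\phi_k^*(t) - r_k^*(t) \mid \boldsymbol{\Xi}(t)\big] \\
& + \sum_{s=1}^{S} D_s(t)\, \mathbb{E}\big[\phi_{s+M}^*(t) - \varphi^{(bs)*}(t)\, r_s^{cs*}(t) \mid \boldsymbol{\Xi}(t)\big],
\end{aligned}
$$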
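where $\boldsymbol{\phi}^*(t)$, $\varphi^{(bs)*}(t)$, and $r_k^*(t)$ are the optimal values of problem (49). Since the queues are bounded, for a given $\chi \geq 0$ we obtain
$$ \Delta(\boldsymbol{\Xi}(t)) - \nu \mathbb{E}[f_0(\boldsymbol{\phi}(t))] \leq \Psi - \nu \mathbb{E}[f_0(\boldsymbol{\phi}^*(t))] + \chi. $$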
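Taking expectations of both sides of the above inequality and choosing $\mathbf{r}^*(t) = \boldsymbol{\phi}^*(t)$ yields, for all $t \geq 0$,
$$ \mathbb{E}\big[L(\boldsymbol{\Xi}(t+1)) - L(\boldsymbol{\Xi}(t)) \mid \boldsymbol{\Xi}(t)\big] - \nu \mathbb{E}[f_0(\boldsymbol{\phi}(t))] \leq \Psi + \chi - \nu \mathbb{E}[f_0(\mathbf{r}^*(t))]. $$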
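Applying the law of iterated expectations, summing over $\tau = 0, \ldots, t-1$, and dividing by $t$ (using the fact that $f_0(\mathbf{r}^*(t)) = f_0^*$) yields
$$ \frac{\mathbb{E}[L(\boldsymbol{\Xi}(t))] - \mathbb{E}[L(\boldsymbol{\Xi}(0))]}{t} - \frac{\nu}{t} \sum_{\tau=0}^{t-1} \mathbb{E}[f_0(\boldsymbol{\phi}(\tau))] \leq \Psi + \chi - \nu f_0^*. \tag{121} $$
Using the fact that $L(\boldsymbol{\Xi}(t)) \geq 0$ and $L(\boldsymbol{\Xi}(0)) = 0$, applying Jensen's inequality to the concave function $f_0$, and rearranging terms yields
$$ f_0(\boldsymbol{\phi}(t)) \geq f_0^* - \frac{\Psi + \chi}{\nu}. $$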
Since the network utility function is a non-decreasing concave function and the auxiliary variables are chosen to satisfy $r_k(t) \geq \phi_k(t)$, we have $f_0(\mathbf{r}(t)) \geq f_0(\boldsymbol{\phi}(t)) \geq f_0^* - \frac{\Psi+\chi}{\nu}$, which means that the solution approaches the optimum as $\nu$ increases. This completes the proof of Theorem 1.1. Since the utility gap shrinks as $O(1/\nu)$ while the queue bounds (116)-(118) grow as $O(\nu)$, there exists an $[O(1/\nu), O(\nu)]$ utility-queue length tradeoff, which leads to a utility-delay balancing.
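To make the tradeoff concrete, the following Python sketch evaluates the utility gap $(\Psi+\chi)/\nu$ from Theorem 1.1 and the network queue bound from (116) for a few values of $\nu$. The numerical values of $\Psi$, $\chi$, $\omega_k$, $\pi_k$, and $a_k^{\max}$ are arbitrary placeholders, not values from the thesis, and the function names are hypothetical.

```python
def utility_gap(nu: float, psi: float, chi: float) -> float:
    """Worst-case utility loss (Psi + chi) / nu from Theorem 1.1."""
    return (psi + chi) / nu


def queue_bound(nu: float, omega: float, pi: float, a_max: float) -> float:
    """Network queue bound nu*omega*pi + 2*a_max from (116)."""
    return nu * omega * pi + 2.0 * a_max


# Placeholder constants chosen only for illustration.
PSI, CHI, OMEGA, PI_K, A_MAX = 50.0, 5.0, 1.0, 1.0, 3.0

for nu in (1.0, 10.0, 100.0, 1000.0):
    gap = utility_gap(nu, PSI, CHI)
    bound = queue_bound(nu, OMEGA, PI_K, A_MAX)
    print(f"nu = {nu:7.1f}  utility gap <= {gap:8.3f}  queue bound = {bound:8.1f}")
```

Increasing $\nu$ tightens the utility guarantee while loosening the backlog guarantee, so $\nu$ acts as the single knob that trades utility against delay.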
We now prove that all queues are stable using Definition 2.2. The bound (121) can be rewritten as
$$ \Delta(\boldsymbol{\Xi}(t)) \leq C, $$
where $C$ is any constant that satisfies, for all $t$ and $\boldsymbol{\Xi}(t)$, $C \geq \Psi + \chi - \nu\big(f_0^* - \mathbb{E}[f_0(\boldsymbol{\phi}(t))]\big)$. Using the definition of the Lyapunov drift and taking expectations, we obtain
$$ \mathbb{E}[L(\boldsymbol{\Xi}(t))] \leq Ct. $$
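From the definition of the Lyapunov function $L(\boldsymbol{\Xi}(t))$, we have
$$ \mathbb{E}[Q_k(t)]^2,\ \mathbb{E}[Y_k(t)]^2,\ \mathbb{E}[D_s(t)]^2 \leq 2Ct. $$
Dividing both sides by $t^2$ and taking square roots shows that, for all $t > 0$,
$$ \frac{\mathbb{E}[Q_k(t)]}{t},\ \frac{\mathbb{E}[Y_k(t)]}{t},\ \frac{\mathbb{E}[D_s(t)]}{t} \leq \sqrt{\frac{2C}{t}}. $$
Taking the limit as $t \to \infty$, the right-hand side vanishes, which proves that the queues are stable.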