massive mimo channel prediction using recurrent neural

Received 15 Jul. 2020, Accepted 10 Aug. 2020, Date of publication 12 Aug., 2020, date of current version 12 Aug. 2020.

Digital Object Identifier (DOI): 10.46470/03d8ffbd.80623473

Massive MIMO Channel Prediction UsingRecurrent Neural NetworksJOEL PONCHA LEMAYIAN1, JEHAD M. HAMAMREH21Department of Electrical and Computer Engineering, Antalya Bilim University,Antalya, Turkey (e-mail: [email protected]).2Department of Electrical and Electronics Engineering, Antalya Bilim University,Antalya, Turkey (e-mail: [email protected]).Authors are from WISLAB for Innovations in Wireless Communication Computation laboratory.

Corresponding author: Joel P. Lemayian (e-mail: [email protected]).

This work is funded by the scientific and technological research council of Turkey (TÜBITAK) under grand 119E392. All the codes usedcan be found at researcherstore.com.

ABSTRACT Massive MIMO has been classified as one of the high potential wireless communicationtechnologies due to its unique abilities such as high user capacity, increased spectral density, anddiversity among others. Due to the exponential increase of connected devices, these properties areof great importance for the current 5G-IoT era and future telecommunication networks. However,outdated channel state information (CSI) caused by the variations in the channel response dueto the presence of highly mobile and rich scattering is a major problem facing massive MIMOsystems. Outdated CSI occurs when the information obtained about the channel at the transmitterchanges before transmission. This leads to performance degradation of the network. In this work,we demonstrate a low complexity channel prediction method using neural networks. Specifically,we explore the power of recurrent neural network utilizing long-short memory cells in analyzingtime series data. We review various neural network-based channel prediction methods available inthe literature and compare complexity and performance metrics. Results indicate that the proposedmethods outperform conventional systems by tremendously lowering the complexity associated withchannel prediction.

INDEX TERMS Massive MIMO, Neural networks, RNN, Artificial Intelligence, Channel stateinformation.

I. INTRODUCTION

Multiple input multiple output (MIMO) employsmultiple antennas at the transmitter and/or re-

ceiver. This technology has highly desired propertiessuch as high throughput, high spectral efficiency, andmultiplexing gains [1]. MIMO has evolved from a mereresearch concept to a real-world application and has beenintegrated into state-of-the-art wireless network stan-dards such as IEEE 802.11n, 3GPP long-term evolution(LTE) and LTE- Advanced (E-UTRA) [3]. In a massiveMIMO (mMIMO) system, the number of antennas inMIMO increases to hundreds. mMIMO has been classifiedas one of the high potential wireless communicationtechnologies with the ability to have high user capacity[4], which is a key requirement for 5G-IoT and beyondtechnologies [6] [7] [16].Nevertheless, numerous challenges are facing mMIMOand the general wireless communication paradigms. This

is due to the population increase of users which causesincreases in the number of connected devices [8] [9] [10][11]. Outdated channel state information (CSI) causedby the variations in the channel response due to thepresence of highly mobile and rich scattering is a majorproblem facing mMIMO systems [4]. Outdated CSI oc-curs when the information obtained about the channelchanges before transmission. Proposed solutions in theliterature to combat the outdated CSI problem are mainlydivided into passive and sub-optimal methods [12]. Pas-sive method passively compensates for the performanceloss at the cost of wireless resources (frequency, time,power), while sub-optimal methods assumes imperfectCSI as a constraint and aims to acquire only partialperformance [13].For example, in time division duplex (TDD) systems thechannel is reciprocal, where the same frequency is usedfor both uplink and downlink. Therefore the downlink

1

https://sites.google.com/view/wislab/home

https://researcherstore.com/product/massive-mimo-channel-prediction-using-neural-networks-codes/

Poncha et al.: mMIMO Channel Prediction Using Neural Networks

channel is estimated from the uplink channel in conven-tional TDD systems [19]. Nevertheless, the increase inthe number of devices caused by the exponential humanpopulation increase, and the IoT era where everythingis connected to everything has forced engineers to useultra-high bands including millimeter and terahertz inwireless communication [5]. Channel coherence time issignificantly reduced in ultra-high bands thus becomingshorter than the pilot transmission time. Consequently,the conventional uplink-downlink channel state acquisi-tion method will provide an outdated CSI resulting insignificant system performance degradation. In light ofthis, an efficient and reliable channel state predictionsystem is in demand.Moreover, considering channel modeling in mMIMOchannel prediction, Rayleigh distance between the trans-mitter and receiver in MIMO is defined as 2L2/λ, where Lis the antenna dimension and λ is the carrier wavelength[15]. Unlike MIMO, mMIMO has a large number ofantennas, which may cause the distance between thereceiver and transmitter to be shorter than Rayleighdistance. Therefore, the far-field assumption defined in[20] cannot be used and in turn, we use the near fieldassumption defined in [21]. Moreover, non-stationarity isalso considered in mMIMO due to the changing antennasand the varying physical environment [20] [21].Neural Networks (NN) as an AI technique, is an effec-tive recently proposed model for combating outdatedCSI without wasting resources [14]. It is highly valuedbecause it can avoid parameter estimation due to its data-driven nature. Channel prediction is viewed as a revo-lutionary future technology and hence it has attractedthe attention of many researchers [4]. Additionally, in-stantaneously selecting transmit parameters (enabled bychannel prediction) such as transmit power, coding rate,transmit antennas, and carrier frequency, depending onthe instantaneous condition of the channel using channelprediction will tremendously improve the performanceof adaptive wireless communications systems (Which arethe future of effective wireless communication).In this work, we concentrate on implementing channelstate prediction in mMIMO using artificial intelligence,specifically recurrent neural networks (RNNs). Fig. 1depicts a mMIMO BS with hundreds of antennas andmultiple users.RNNs are very effective when processing time series data.Since channel response data is closely related to timeseries data, we look at mMIMO channel prediction usingRNNs as a technology with great future potential that willhave a major impact in wireless technology. Moreover,we compare performance metrics of conventional CSIprediction processes with RNN-based prediction. lastly,an RNN model utilizing LSTMs is designed. The maincontributions of this work are:

1) Propose a low cost mMIMO RNN-based CSI predic-

tor.2) Provide quick answers about RNN-based mMIMO

CSI prediction.3) Demonstrate performance metrics between con-

ventional CSI predictors and RNN-based predictors,in terms of complexity and cost.

4) Develop a mMIMO channel prediction method for128 transmit antennas.

5) Review recent and common neural network channelprediction schemes.

The remainder of this work is organized as follows.In Section II we discuss the literature review relatedto this topic. In Section III we present the mMIMOsystem model. Section IV discusses about commonly usedmachine learning MIMO channel prediction models andtheir limitations, while Section V explains about RNN-based mMIMO channel prediction process using theRNN predictor. Section IV discusses about the mMIMOchannel prediction process. In Section VII we discuss thesimulation results and finally, the conclusion is presentedin Section VIII.

TABLE 1. NOMENCLATURES

Abbreviations Definitions

ACM adaptive coded modulationAI artificial intelligenceAOA angle of arrivalAOD angle of departureAP access pointBER bit error rateBP back-propagationBS Base StationCNN convolutional neural networksCSI channel state informationDL downlinkESPRIT estimation of signal parameters via

rotational invariant techniquesFDD frequency division duplexIEP interval of effective predictionLSTM long short-term memoryLTE long-term evolutionMIMO multiple input multiple OutputMUSIC multiple signal classificationNMSE normalized mean square errorOFDM orthogonal frequency division multiplexingRNN recurrent neural networkSCNet sparse complex-valued neural networkTDD time division duplexUL uplinkULA uniform liner array

II. RELATED WORKSAccording to [18], frequency division duplexing (FDD)based networks is superior compared to TDD-based net-works due to its low latency and high anti-interferenceproperties. Nevertheless, its computation and feedbackchallenges for predicting downlink channel state infor-mation (DL-CSI) are the main constraints for advancingperformance in FDD cellular networks. To solve thisproblem, the authors in [18] propose the use of deep

2


Transmitters

Radio signals

Receiver A

Receiver B

Receiver C

Receiver D

FIGURE 1. Massive multiple input multiple output (mMIMO) model.

learning convolutional Long Short-Term Memory network(convLSTM-net) to predict DL-CSI using uplink channelstate information (UL-CSI). The proposed convLTSM-netconsists of two modules. The first one is the featureextraction module, responsible for learning spatial andtemporal correlations between DL-CSI and UL-CSI. Thesecond part is the prediction model which maps theextracted features to the reconstruction of DL-CSI. More-over, the authors compare the performance of convolu-tional neural networks (CNN) and long short-term mem-ory networks (LSTM) to convLSTM-net. The performancesimulation was further divided into two parts. In part one,the hyperparameters on the proposed convLSTM-net areanalyzed to investigate their effects on the predictionperformance. In part two the experiment is performed inboth time and frequency domain to determine the properenvironment for accurate prediction of DL-CSI. Accordingto Wang et. al. [18], the results show that convLSTM-netoutperforms both CNN and LSTM in DL-CSI predictionusing UL-CSI in cellular FDD networks. However, theproposed convLSTM-net has high complexity evidentfrom its design.In addition, authors in [22] claim that in an FDD mMIMOsystem, the acquisition of downlink CSI is a complextask due to the overheads required for downlink train-ing and uplink feedback. The authors propose a sparsecomplex-valued neural network (SCNet) system used tomap uplink to downlink. Unlike the previous system, thisparadigm is modeled in the complex domain and canlearn the complex-valued mapping function by off-linetraining. According to the authors, numerical results sug-gest that SCNet outperforms conventional deep network-based channel prediction in terms of prediction accuracyand robustness over complex wireless channels.Authors in [28] propose an ESPRIT-based parameterprediction model for narrow-band MIMO systems thatexploit both temporal and spatial correlations in practicalMIMO channels. The model estimates channel param-eters by employing a vector transmit spatial signature

model and two-dimensional ESPRIT. According to theauthors, the proposed scheme is suitable for both two-dimensional azimuths only and three-dimensional MIMOspatial channel models.Authors in [23] utilize back-propagation (BP) in a mul-tilayered neural network to model a multi-time channelprediction system. The paradigm is used to effectivelypredict CSI and enhance mMIMO performance, powercontrol, and artificial noise physical layer security design.Additionally, the authors utilize a previous stopping cri-terion to prevent over-fitting of BP in neural networks.According to the authors, the results demonstrated bycomparing the predicted normalized mean square error(NMSE), indicate that the performance of the proposedmodel has improved. Additionally, a sparse channel con-struction model used to save system resources withoutdeteriorating the performance is proposed.According to [24], a mMIMO channel is characterized bynon-stationarity and quick variations. Therefore, usingconventional methods to obtain CSI will result in anoutdated CSI and consequently degrading the perfor-mance of the network. Authors in [24] propose a channelprediction algorithm used in massive MIMO. The au-thors propose a first-order Taylor expansion-based chan-nel prediction model to handle channel characteristics.Moreover, a channel prediction model with estimationand prediction sections is further proposed to derivean interval of effective predictions (IEP). According tothe authors, numerical simulations from the proposedalgorithm indicate that a reliable channel prediction canbe achieved within IEP.By matching time-varying wireless fading channels totransmitter parameters, known as adaptive modulation,system throughput is considerably improved. Authors in[25] propose a channel prediction scheme using pilotsymbol assisted modulation for MIMO Rayleigh fadingchannels. Moreover, the effect of the channel predictionerror on the bit error rate of a transmit beam-former isanalyzed. According to [25], the results obtained indicatea critical value, under which the adaptive modulator canconsider the predicted channel as perfect, and abovewhich categorical attention of the channel’s imperfectionmust be accounted for at the transmitter.Authors in [19] propose a split-brain autoencoder system,which is a modified regular autoencoder for learning. Theparadigm divides the network into two disjoint parts.Each part performs complex channel prediction tasks.According to the authors, the method produced state-of-the-art results on several large-scale learning standards.Moreover, authors in [1] provide a comprehensive surveyon the application of recurrent neural networks (RNN) inchannel prediction. The complexity and performance ofpredictors are relatively illustrated by numerical results.Adaptive coded modulation (ACM) is a promising methodused to enhance spectral efficiency in a time-varyingmobile channel with no effect on the targeted bit error

VOLUME 4, 2016 3


rate (BER) [26]. However, the transmitter must have per-fect up-to-date channel characteristics for ACM to work.Authors in [26] propose a linear fading-envelope predictorto predict the CSI. Moreover, authors in [27] proposea CSI prediction algorithm for OFDM that determinestime-delays and Doppler frequencies of each propagationpath. According to the authors, the method requires lessfeedback information and has better mean-squared errorperformance than previous methods.The model proposed in this work differs from the dis-cussed reviews in that the model strives to provide alow-cost low-complex CSI predictor with the ability tosuport future wireless technologies such as mMIMO andmm-Wave. Key contributions are:

1) Propose a low-complexity low-cost mMIMO RNN-based CSI predictor.

2) Demonstrate performance metrics between con-ventional CSI predictors and RNN-based predictors,in terms of complexity and cost

3) Develope and RNN-based predictor for mMIMO.

In the next section, we will discuss the mMIMO systemmodel.

III. mMIMO CHANNEL PREDICTION SYSTEM MODELIn a time-varying mMIMO system, there are MT trans-mitter antennas and MR receiver antennas. Where, at anyinstance MT ≤ MR [24]. Each antenna transmits N timeslots at a given transmission, where N ≥ MT . Consideringan instantaneous signal transmit and receive in mMIMO,the base-band receiver can be shown as:

r(t ) = H(t )s(t )+n(t ), (1)

where r(t ) = [r1(t ), . . . ,rNr (t )]T is an Nr ×1 vector of thereceive signal at time t (Nr is the number of receiveantennas) and s(t ) = [s1(t ), . . . , sNr (t )]T is an Nt ×1 vectorof the transmit signal at time t (Nt is the number oftransmit antennas). H(t ) = [hnr nt (t )]Nr ×Nt is the matrix ofcontinues channel impulse response and hnr nt ∈ C1×1 isthe flat fading channel gain between transmitter antennant and receiver antenna nr . Moreover, 1 ≤ nr ≤ Nr and1 ≤ nt ≤ Nt . Due to the multipath fading, feedback andprocessing delays the obtained CSI at the transmitter maybe outdated before it can be used. That is, H(t ) 6= H(t+τ),consequently, this will lead to performance degradationof adaptive communication systems [3]. The goal ofchannel prediction is to estimate H(t + τ) at time t tobe as close as possible to the actual value at (t +τ). Thatis H(t + τ) → H(t + τ). Closely related MIMO predictiontechniques are briefly discussed in the subsequent sec-tion.

IV. RELATED PREDICTOR MODELSApart from RNN-base predictors, several other methodshave been proposed for mMIMO CSI prediction. Paramet-ric and autoregressive channel prediction models are the

most popular techniques [1]. In this section, the modelsare briefly discussed to provide the reader with a cleardistinction to the proposed work.

A. PARAMETRIC PREDICTOR MODELAs stated by [16], [17], a single antenna channel isrepresented by overlaying a set of complex sinusoids inpopular multipath fading models.

h(t ) =P∑

p=1αp e j (ωp t+φp ), (2)

where αp is the complex amplitude, ωp is the Dopplerfrequency shift in radians of the p th sinusoid, and φp

is the phase. j 2 = −1 denotes complex units and Prepresents the total number of scattered sinusoids. Thesingle-antenna system depicted in equation (2) can bemodeled to represent a MIMO propagation model shownby equation (3), by introducing spatial dimension param-eters.

H(t ) =P∑

p=1αp ar(θp )aT

t (ψp )e j (ωp t+φp ), (3)

where θp and ψp are the angle of arrival (AOA) andangle of departure (AOD) respectively. ar and at are theresponse vector of the receiver and transmitter antennaarrays respectively. a can be represented as a uniform lin-ear array (ULA) with M equally spaced antenna elementsas follows:

a(x) = [1,e− j ( 2πλ

)d si n(x), ...,e− j ( 2πλ

)(M−1)d si n(x)]T , (4)

where x can either be the angle of arrival or departure,λ is the wavelength of the sub-carrier frequency, andd is the distance between antennas. According to [1],multipath parameters change slowly compared to thechannel fading rate. Therefore, future CSI up to a cer-tain period can be obtained by simply extrapolating theknown multipath parameters. Hence, channel predictionin mMIMO using equation (13) is reduced to parameterprediction. That is, a parameter prediction model topredict the total number of scatters, the angle of arrivaland departure, and the Doppler shift for each path (i.e.{αp ,ωp , θp ,ψp }p

p=1 ). In the following section, we modela prediction procedure for the mentioned parameters.

1) parametric model prediction procedure1) Step 1:

We define k independent discrete-time channels{H[k]|k = 1, . . . ,K }, sampled from the continuouschannel response H(t ). We can therefore model alarge matrix containing all the required translationalinvariance structure in all dimensions. Accordingto [28], a block-Hankel matrix with dimensionsNr Q ×Nt S is represented as follows:

4


D =

H[1] H[2] ... H[S]H[2] H[3] ... H[S +1]

. . . .

. . . .H[Q] H[Q +1] ... H[K ]

, (5)

where Q is the size of the Hankel matrix and S =K −Q+1. We use equation (5) to calculate a Spatio-temporal covariance matrix C.

C = DDH

Nt S, (6)

where (·)H represents the Hermitian conjugatetranspose.

2) Step 2:We estimate the dominant scattering sources P us-ing the minimum description length (MDL) systemdescribed by [29].

P = ar g mi nz=1,...,(Nr Q−1)[Sl og (λz )+ (z2 + z)log (S)

z], (7)

where λz represents the z th eigenvalue of C.3) Step 3:

Algorithms such as multiple signal classification(MUSIC) and estimation of signal parameters byrotational invariance techniques (ESPRIT) to find{ωp , θp ,ψp }p

p=1 from C.4) Step 4:

Now that we have {ωp , θp ,ψp }pp=1, we calculate

{αp }pp=1 by substituting the former parameters to

equation (13) and obtain the equation below.

H(τ) =P∑

p=1αp ar(θp )aττ(ψp )e j (ωpτ+φp ), (8)

where τ is the time steps predicted.

From the process described above, it is evident thatthis model experiences some constraints. The estimationprocess is tedious and has high complexity due to themanipulation of high order matrices. Moreover, equations(3) and (4) are highly dependent on the type of arrayused. That is, if a different kind of array is used thenequations (3) and (4) must be adjusted accordingly. Fi-nally, the obtained prediction outdates faster especiallyin a continuously changing environment compared to aconstant environment.

B. AUTOREGRESSIVE (AR) PREDICTOR MODELThe mMIMO time varying channel can alternativelybe formulated using an autoregressive process whereKalman filters (KF) are used to compute AR coefficientsused to build a liner predictor used to predict futureCSI using current and past CSI [1]. the AR predictor formMIMO can be represented as:

H(t −1) =P∑

p=1Ap ©H[t −p +1], (9)

where Ap = {apnr nt

} is an nr ×nt AR coefficient matrix,such that ap

nr ntis the p th AR coefficient of the channel

between transmitter nt and receiver nr . Other predictorsproposed in literature include maximum-likelihood (ML)estimation, least-square (LS) estimation, and minimum-mean-square-error (MMSE) estimation

CSI prediction using the above models is faced by anumber of challenges such as high complexity, low accu-racy, lack of generality, single-step prediction limitation,and unreliability, hence such methods are only suitablefor small scale estimation [2]. CSI prediction process inRNNs is summarised in the subsequent section.

V. Proposed method: RECURRENT NEURALNETWORKS (RNNS)In this section, we will discuss the RNN model which isa powerful machine learning technique that has showngreat potential in predicting time series data. RNNs aresuperior because they not only use training data forlearning but also learn from historical data of past events.There are different models of RNNs, Fig. 3 depicts thecommonly known Jordan network. A simplified RNNnetwork consists of an input layer with Ni neurons, ahidden layer with Nh neurons, and an output layer withNo outputs. Each connection between the input layer, thehidden layer, and the output layer is assigned a weightedvalue. Let wln denote the weight between the l th inputand the nth hidden neuron, and vol represent the weightof the l th hidden neuron and the oth output neuron. Suchthat 1 ≤ n ≤ Ni , 1 ≤ l ≤ Nh , and 1 ≤ o ≤ No . Therefore, wecan construct an Nh ×Ni matrix W of weights shown by(10).

W =

w11 ... w1Ni

. . .

. . .wNh 1 ... wNh Ni

. (10)

The input activation vector is denoted as x(t ) =[x1(t ), . . . , x(Ni )(t )]T while the recurrent or feedback com-ponent is modeled as: f(t ) = [ f1(t ), . . . , f(Ni )(t )]T . There-fore, the input to the hidden layer can be represented bythe following equation:

zh(t ) = Wx(t )+ f(t )+bh , (11)

where bh = [bh1 , . . . ,bh

(Nh )]T represents the bias in the

hidden layer. In addition, we use a matrix F to map theprevious output y(t −1) = [y1(t −1), . . . , yNo (t −1)]T to thecurrent input component. Hence,

f(t ) = Fy(t −1). (12)

The behavior of a neuron network is determined by anactivation function (AF). The activation function intro-duces some nonlinearity characteristics to the system,

5


∑

AF

Tangent

Rectified linear

Sigmoid

W

f(t)bh

Threshold

h(t)

Linear

FIGURE 2. Activation functions for a neural network designed to introducenon-linearity.

which means the output is not simply a constant scalingof the input (i.e. the rate of change is not constantacross all independent variables). Hence enabling it tosolve complex problems. Common activation functionsinclude liner, rectified liner hyperbolic tangent, and leakyrectified liner (see Fig. 2). In this work, we use the sigmoidfunction to introduce nonlinearity. The sigmoid functionis defined as:

S(x) = 1

1+ex . (13)

Substituting equations (11) and (21) into equation (13),we get the following equation:

h(t ) = S(zh(t )) = S(wx(t )+Fy(t −1)+bh). (14)

zh(t ) is of size Nh × 1, therefore, equation (14) is ex-ecuted in an element wise operation. That is, S(Zh) =[S(z1), . . . ,S(zNh )]T . A matrix V of weights with dimen-sions No ×Nh exists between the hidden and the outputlayer. The input to the output layer therefore becomes,zo(t ) = Vh(t ) + by where by is the bias matrix for theoutput layer of size No×1. The output equation is derivedas:

y(t ) = S(zo(t )) = S(Vh(t )+by ). (15)

However, RNNs suffers from vanishing and explodinggradient problems. The vanishing gradient problem isexperienced during back-propagation. This is where thepartial derivative of the loss function with respect tothe current weight progressively diminishes during back-propagation and hence has no effect on the weights whenperforming gradient descent. On the other hand, theexploding gradient is experienced when large gradienterrors accumulate causing large updates on the networkweights during training. LSTM (long short term memory)networks are special types of RNNs that use gates toovercome the problems experienced by conventionalRNNs. The gates in LSTM networks include input, output,and forget gates, they facilitate better control of gradientflow and prevention of long-range dependencies. In thiswork, we utilize LSTMs as shown in Fig. 4. Equations that

describe the functionality of an LSTM unit cell are givenbelow.

it =σ(W (i )xt +U (i )ht−1), (16)

ft =σ(W ( f )xt +U ( f )ht−1), (17)

ot =σ(W (o)xt +U (o)ht−1), (18)

ct = t anh(W (c)xt +U (c)ht−1), (19)

ct = ft © ct−1 + it © ct , (20)

ht = ot © t anh(ct ). (21)

© denotes element-wise multiplication. W denotes therecurrent connection between the previous and the cur-rent hidden layers. U is the weight matrix between thecurrent input and the hidden layer. C is calculated basedon the current input and the previous hidden state andis a hidden state candidate. C is the internal memoryof the cell, which is the sum of pr evi ous_memor y ©f or g et_g ate and newl y_computed_hi dden_st ate ©i nput_g ate.The first operation done in an LSTM layer is to decidewhether the information is kept or discarded. This op-eration is done in the forget gate. A number between 0and 1 is produced, where 1 means keep the information,while 0 means completely forget. The next step is todecide which new information will be kept in the cellstate. This process is done in two parts, first, a sigmoidfunction called the input gate determines which valueswill be updated, then a tanh layer determines candidatecell states to be added. Next, the previous cell state Ct−1

needs to be updated, that is Ct = ft ×Ct−1+it ×Ct . whereft makes the old state forget information discarded in theforget gate. Finally, the output of the cell is modeled bythe output gate which is a sigmoid layer that determineswhich part of the cell state will be output. The cell state ispassed through a layer of tanh which forces the cell stateto be between -1 and 1, then the output of the outputgate is multiplied by the output of the tanh layer to getthe output of the cell.The use of the forget gate in an LSTM cell helps it todecide which information needs to be discarded andwhich information is to be used to update the modelsparameters at each time step. Hence, this helps the modelprevent against vanishing and exploding gradient. Tounderstand this better, lets say that the error vanishinggradient ∂E with respect to some weight W at some timestep k < T is:

k∑t=1

∂Et

∂W→ 0, (22)

then for the gradient not to vanish, a suitable parameteris found for the next time step such that:

∂Ek+1

∂W! → 0. (23)

6


Input layer

Hidden layerOutput layer

VW

… …

…

…… [F]

x y

RNNCELL

x

y

f

𝒁! 𝑡 𝒁" 𝑡

FIGURE 3. Conventional recurrent neural network system.

The presence of the activation vector in the forget gateallows it to find such a parameter.The data set used is divided into two, training and predic-tion data sets. The training data set is used to train themodel through backpropagation (BP). The model takesin the input x and the preferred output y then calculatesthe cost C = ||y−x||2. The error is then propagated backthrough the network, causing the weights to iterativelyadjust until convergence is obtained and a minimum costis achieved. To commence prediction using LSTMs, theinitialization process is described below.

1) LSTM INITIALIZATION AND PREDICTION PROCEDURE1) Randomly select W,V,bh,by.2) Define sequence length. This is the number of

historical data considered when performing theprediction.

3) Set feature period predict. This defines the timeunits predicted into the future.

4) Set epoch. This is the number of times that entiredata points are passed through the network.

5) Set batch size. Number of grouped samples fromthe total samples processed at a time.

6) Preprocess the data. Data is scaled and cleaned, i.e.NANs and empty rows are removed.

7) Split data into validation and train data sets.8) Create an LSTM model.

VI. mMIMO CSI PREDICTION PROCESS USING RNNSThe data set h is divided into two, training (60%) andtesting/prediction (40%) data sets. The training data setis used to train the model through backpropagation (BP).Referring to Fig. 4, the model takes in the input x = h(t )and the preferred output y = h(t+D), where D is the stepsto be predicted into the future. The model then calculatesthe cost C, using mean squared error (MSE):

C = 1

T

T∑t=1

||h(t +D)−h(t +D)||2, (24)

T is the is the total number of channel samples. The erroris then propagated back through the network, causingthe weights to iteratively adjust through gradient descentuntil convergence is obtained and a minimum cost isachieved. Initial parameters such as Ct−1 and ht−1 aregenerally initialised with zero values, while the weightsare randomly initialized. The testing set is then used tomeasure the accuracy of the network. A 2-layer mMIMORNN-based model predictor utilizing LSTM was trainedusing 200,000 samples of h from 128 antenna generatedwith Rayleigh distribution. In this simulation, Python 3.0was used and the properties of Keras used as a deeplearning framework were version 2.3.1, Keras-Applications1.0.8, and Keras-Preprocessing 1.1.0. A summary table ofthe utilized network indicating the number of parametersand the network topology is depicted in Table 2. Adamoptimizer was employed with an initial learning rate of0.01 and a batch size 0f 256. Moreover, we use dropoutregularization to prevent overfitting.

TABLE 2. Proposed LSTM network properties

Layer (type) Output Shape Param No.

lstm 10 (LSTM) (None, 128, 32) 4480dropout 1 (Dropout) (None, 128, 32) 0lstm 11 (LSTM) (None, 16) 3136dropout 2 (Dropout) (None, 16) 0dense 3 (Dense) (None, 64) 1088

Total params : 8,704Trainable params : 8,704Non-trainable params : 0

VII. RESULTSFig. 5 indicates characteristics of the mMIMO channel,where a and b are real and imaginary parts of the channelrespectively). As it can be observed, the channel is nearto a random signal due to the random nature of theenvironment. However, we can also observe that the datais a time series data with some repetition. Fig. 6 shows

VOLUME 4, 2016 7


x

Forget Gate

Input Modulation Gate

Input Gate Output Gate

x + xtanh

𝝈

𝝈 𝝈

xt

ht-1

Ct-1

ft

tanh!𝒄𝒕

it

Ct

ht

ot

FIGURE 4. Long Short Term Memory (LSTM) unit cell.

historical and true future real part of the channel dataused to train the model. Similarly, the imaginary partwith the same historical and future window is fed intothe model for training. Fig. 7 shows the validation and

Samples

Samples

Magnitude

Magnitude

FIGURE 5. Real and imaginary parts of the channel (a is real, b is imaginary).

Samples

Magnitude

FIGURE 6. History and future true channel data for massive MIMO.

training loss after 10 epochs with a batch of 500. Itcan be observed that both training loss and validationloss gradually converge. This indicates that the model isneither over-fitting nor under-fitting due to the use ofdropout layers in the model as it can be observed fromTable 2. In this case, the accuracy of the model can be

improved by increasing the number of epoch since thelosses are still on a downwards trend. Moreover, Fig. 8 and

FIGURE 7. mMIMO training and validation loss.

9 shows the prediction of the real and imaginary chan-nel parts respectively of the channel. The results showthat the model can estimate the channel’s characteristicswith minimum complexity compared to other methodsmentioned in section IV. It can also be observed fromthe figure that the accuracy of the predictor is not 100%.To improve the accuracy of the predictor, training dataand iterations can be increased as well as LSTM layers orusing different regularizers.

Magnitude

Time

FIGURE 8. Real part prediction of the massive MIMO channel.

8 VOLUME 4, 2016


TimeSamples

Magnitude

FIGURE 9. Imaginary part prediction of the massive MIMO channel.

VIII. CONCLUSIONConsidering that mMIMO is a technology that has showngreat potential in enhancing wireless communicationin the future, this work has demonstrated that RNN-based CSI prediction is the ideal technology to boost theperformance of mMIMO systems by lowering complexityand increasing accuracy. Therefore, we intent to furtherthis work by perfecting mMIMO CSI prediction usingdifferent configurations of RNNs utilizing LSTMS or GRUsand improving the accuracy of the designed predictor.

REFERENCES[1] Jiang W, Schotten HD. "Neural Network-Based Fading Channel Predic-

tion: A Comprehensive Overview." IEEE Access 7 (2019): 118112-118124.[2] Luo C, Ji J, Wang Q, Chen X, Li P. "Channel state information predic-

tion for 5G wireless communications: A deep learning approach." IEEETransactions on Network Science and Engineering (2018).

[3] Zheng J, Rao BD. "Capacity analysis of MIMO systems using limitedfeedback transmit precoding schemes." IEEE Transactions on SignalProcessing 56, no. 7 (2008): 2886-2901.

[4] Jiang W, Schotten HD. "Deep Learning for Fading Channel Prediction."IEEE Open Journal of the Communications Society 1 (2020): 320-332.

[5] Poncha LJ, Abdelhamid S, Alturjman S, Ever E, Al-Turjman F. "5G in aconvergent Internet of Things era: An overview." In 2018 IEEE Inter-national Conference on Communications Workshops (ICC Workshops),pp. 1-6. IEEE, 2018.

[6] Al-Turjman F, Lemayian JP. "Intelligence, security, and vehicular sensornetworks in internet of things (IoT)-enabled smart-cities: An overview."Computers Electrical Engineering 87 (2020): 106776.

[7] Lemayian JP, Al-Turjman F. "Intelligent IoT communication in smartenvironments: An overview." In Artificial Intelligence in IoT, pp. 207-221.Springer, Cham, 2019.

[8] Al-Turjman F, Lemayian JP, Alturjman S, Mostarda L. "Optimal Place-ment for 5G Drone-BS Using SA and GA." In Drones in IoT-enabledSpaces, pp. 43-58. CRC Press, 2019.

[9] Al-Turjman F, Lemayian JP, Alturjman S, Mostarda L "Enhanced deploy-ment strategy for the 5G drone-BS using artificial intelligence." IEEEAccess 7 (2019): 75999-76008.

[10] LEMAYIAN JP, HAMAMREH JM. "Autonomous First Response Drone-Based Smart Rescue System for Critical Situation Management in FutureWireless Networks." (2020).

[11] Lemayian JP, Hamamreh JM. "First Responder Drones for Critical Sit-uation Management." In 2019 Innovations in Intelligent Systems andApplications Conference (ASYU), pp. 1-6. IEEE, 2019.

[12] Hamamreh JM, Furqan HM, Ali Z, Sidhu GA. "Enhancing the securityperformance of OSTBC using pre-equalicodization." In 2017 Interna-tional Conference on Frontiers of Information Technology (FIT), pp.294-298. IEEE, 2017.

[13] Love DJ, Heath RW, Lau VK, Gesbert D, Rao BD, Andrews M. "Anoverview of limited feedback in wireless communication systems." IEEEJournal on selected areas in Communications 26, no. 8 (2008): 1341-1365.

[14] Hamamreh JM, El_sallabi H, Abdallah M, Qaraqe K. "Advance in Adap-tive Modulation for Fading Channels." (2013).

[15] Furqan HM, Hamamreh JM, Arslan H. "Secret key generation usingchannel quantization with SVD for reciprocal MIMO channels." In2016 International Symposium on Wireless Communication Systems(ISWCS), pp. 597-602. IEEE, 2016.

[16] Hamamreh JM, Hajar A, Abewa M. "Orthogonal frequency divisionmultiplexing with subcarrier power modulation for doubling the spec-tral efficiency of 6G and beyond networks." Transactions on EmergingTelecommunications Technologies 31, no. 4 (2020): e3921.

[17] Jaradat AM, Hamamreh JM, Arslan H. "OFDM With Hybrid Number andIndex Modulation." IEEE Access 8 (2020): 55042-55053.

[18] Wang J, Ding Y, Bian S, Peng Y, Liu M, Gui G. "UL-CSI data driven deeplearning for predicting DL-CSI in cellular FDD systems." IEEE Access 7(2019): 96105-96112.

[19] Zhang R, Isola P, Efros AA. "Split-brain autoencoders: Unsupervisedlearning by cross-channel prediction." In Proceedings of the IEEE Con-ference on Computer Vision and Pattern Recognition, pp. 1058-1067.2017.

[20] Yaghjian A. "Approximate formulas for the far field and gain of open-ended rectangular waveguide." IEEE Transactions on Antennas andPropagation 32, no. 4 (1984): 378-384.

[21] Yaghjian A. "An overview of near-field antenna measurements." IEEETransactions on antennas and propagation 34, no. 1 (1986): 30-45.

[22] Yang Y, Gao F, Li GY, Jian M. "Deep Learning-Based Downlink ChannelPrediction for FDD Massive MIMO System." IEEE Communications Let-ters 23, no. 11 (2019): 1994-1998.

[23] Liao RF, Wen H, Wu J, Song H, Pan F, Dong L. "The Rayleigh fadingchannel prediction via deep learning." Wireless Communications andMobile Computing 2018 (2018).

[24] Peng W, Zou M, Jiang T. "Channel prediction in time-varying massiveMIMO environments." IEEE Access 5 (2017): 23938-23946.

[25] Zhou S, Giannakis GB. "How accurate channel prediction needs to be fortransmit-beamforming with adaptive modulation over Rayleigh MIMOchannels?." IEEE Transactions on Wireless Communications 3, no. 4(2004): 1285-1294.

[26] Oien GE, Holm H, Hole KJ. "Impact of channel prediction on adaptivecoded modulation performance in Rayleigh fading." IEEE Transactionson Vehicular Technology 53, no. 3 (2004): 758-769.

[27] Wong IC, Evans BL. Evans. "Joint channel estimation and prediction forOFDM systems." In GLOBECOM’05. IEEE Global TelecommunicationsConference, 2005., vol. 4, pp. 5-pp. IEEE, 2005.

[28] Adeogun RO, Teal PD, Dmochowski PA. "Parametric channel predictionfor narrowband mobile MIMO systems using spatio-temporal correla-tion analysis." In 2013 IEEE 78th Vehicular Technology Conference (VTCFall), pp. 1-5. IEEE, 2013.

[29] Huang L, Long T, Mao E, So HC. "MMSE-based MDL method for accu-rate source number estimation." IEEE Signal Processing Letters 16, no. 9(2009): 798-801.

JOEL P. LEMAYIAN received the B.Sc. degreein electrical and electronics engineering fromMiddle East Technical University Turkey, in2017. He is presently pursuing the master’s(M.Sc.) degree in electrical and computer en-gineering. He is currently with Antalya BilimUniversity, Turkey.

He has worked as a research assistant in bothMiddle East Technical University and AntalyaBilim University in IoT lab and Neuroscience

lab respectively. He is an author of numerus journals, conferencepapers and book chapters. His research interests include UAVs, 5GCommunication networks, Artificial Intelligence, Machine Learning, andthe Internet of Things (IoT) applications. 9


JEHAD M. HAMAMREH received the B.Sc.degree in electrical and telecommunication en-gineering from An-Najah University, Nablus,in 2013, and the Ph.D. degree in electrical-electronics engineering and cyber systems fromIstanbul Medipol University, Turkey, in 2018.He was a Researcher with the Department ofElectrical and Computer Engineering, Texas Aand M University at Qatar. He is currentlyan Assistant Professor with the Electrical and

Electronics Engineering Department, Antalya International (Bilim) Uni-versity, Turkey.

His current research interests include wireless physical and MAClayers security, orthogonal frequency-division multiplexing multiple-input multiple-output systems, advanced waveforms design, multi-dimensional modulation techniques, IoT, 5G & 6G and orthogonal/non-orthogonal multiple access schemes for future wireless systems. He is aRegular Reviewer for various refereed journals as well as a TPC Memberfor several international conferences.

10

massive mimo channel prediction using recurrent neural

Documents