
IEICE TRANS. COMMUN., VOL.E96–B, NO.5 MAY 2013, p. 1089

PAPER

A Design of Low Latency Random Access Preamble Detector for LTE Uplink Receiver∗

Joohyun LEE†a), Member, Bontae KOO†, Nonmember, and Hyuckjae LEE††, Member

SUMMARY This paper presents a hardware design of a high-throughput, low-latency preamble detector for the 3GPP LTE physical random access channel (PRACH) receiver. The presented PRACH receiver uses a pipelined structure to improve the throughput of power delay profile (PDP) generation, which is executed multiple times during preamble detection. In addition, to reduce detection latency, we propose an instantaneous preamble detection method for both the restricted and unrestricted sets. The proposed method detects all existing preambles directly and instantaneously from the PDP output while conducting PDP combining for the restricted set. PDP combining enables the PRACH receiver to detect preambles robustly even under severe Doppler effect or frequency error. Using the proposed method, the worst-case preamble detection latency can be less than 1 ms with a 136 MHz clock, and the proposed PRACH receiver can be implemented with approximately 237k equivalent ASIC gates or 30.2% of an xc6vlx130t FPGA device.
key words: LTE, uplink, random access, PRACH, latency, throughput

1. Introduction

The long term evolution (LTE) is a promising next-generation communication technology that has been standardized by the 3GPP international organization. On the uplink side of an LTE system, three physical channels are available: the physical uplink shared channel (PUSCH), the physical uplink control channel (PUCCH), and the physical random access channel (PRACH) [1]. The PUSCH is mainly used for transferring data traffic, and the PUCCH is used for control data such as the channel quality index (CQI) and hybrid automatic repeat request (HARQ). The PRACH is used for the random access procedure, which is needed for initial connection, uplink re-synchronization, hand-over and so on.

When a user equipment (UE) needs to start the random access procedure, it selects one of the 64 available preambles and sends that preamble to the base station (BS) as a request to initiate the procedure. The BS then detects the preamble and measures the round trip delay by decoding the specified time-frequency resources (PRACH) of the uplink signal. After successful detection of the preamble, the BS sends a random access response (RAR) message to the detected UE through the downlink connection, and the UE communicates with the BS after adjusting its uplink timing [2].

Manuscript received July 23, 2012.
Manuscript revised December 17, 2012.
†The authors are with Mobile Comm. and Broadcasting Convergence SoC Research Team, ETRI, Korea.
††The author is with Electrical Engineering Department, Korea Advanced Institute of Science and Technology (KAIST), Korea.
∗This work was supported by the Ministry of Knowledge Economy of Korea under the title of “3G LTE based All-In-One Femto-Cell Base-Station SoC Platform”.
a) E-mail: [email protected]
DOI: 10.1587/transcom.E96.B.1089

Several previous studies have evaluated the detection performance of different PRACH detection strategies [4]. The studies in [5]–[7] try to reduce the computational complexity of PRACH reception.

This research focuses on the preamble detection latency and on the preamble detection performance in the presence of Doppler effect or frequency error. Reducing the preamble detection time helps improve network latency, and robust detection under Doppler effect or frequency error is crucial for high-speed cell performance. To this end, we present a hardware design of a low-latency PRACH receiver that detects preambles robustly under Doppler effect or frequency error.

This paper is organized as follows. In Sect. 2, we briefly introduce the PRACH of the LTE system. In Sect. 3, we describe the proposed preamble detection method with an analytic formulation. In Sect. 4, we explain the hardware design of the proposed low-latency PRACH receiver. In Sect. 5, we present the implementation details and results, and in Sect. 6, we draw conclusions.

2. LTE PRACH Overview

In this section, we briefly introduce the physical random access channel (PRACH) of the 3GPP LTE system.

The random access preambles in the LTE system are generated from Zadoff-Chu (ZC) sequences. The $u$-th root ZC sequence, $x_u(n)$, is defined as follows [8]:

$$x_u(n) = \exp\left(-j\pi u n(n+1)/N_{ZC}\right), \quad 0 \le n \le N_{ZC}-1 \tag{1}$$

where $N_{ZC}=839$ is the prime sequence length and $u = 1, 2, ..., 838$ is the root index of the ZC sequence. The ZC sequence is a constant amplitude zero autocorrelation (CAZAC) sequence and has ideal cyclic autocorrelation properties [8]. Therefore, multiple orthogonal sequences can be generated from one ZC sequence by simple cyclic shifting, and these sequences are used as random access preambles. The random access preamble sequence is defined as follows:

$$x_{u,v}(n) = x_u\left((n + C_v) \bmod N_{ZC}\right) \tag{2}$$

where $v$ is the preamble identification number (PID) and $C_v$ denotes the cyclic shift of each preamble.
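The CAZAC property behind (1) and (2) can be checked numerically. The following is a minimal sketch in plain Python; the function names and the choice of $u=129$ and a shift of 13 are illustrative assumptions, not values from the paper.

```python
import cmath

N_ZC = 839  # PRACH ZC sequence length from the text


def zc_root(u, n_zc=N_ZC):
    """u-th root ZC sequence, eq. (1): x_u(n) = exp(-j*pi*u*n*(n+1)/N_ZC)."""
    return [cmath.exp(-1j * cmath.pi * u * n * (n + 1) / n_zc) for n in range(n_zc)]


def preamble(u, c_v, n_zc=N_ZC):
    """Cyclically shifted preamble, eq. (2): x_{u,v}(n) = x_u((n + C_v) mod N_ZC)."""
    x = zc_root(u, n_zc)
    return [x[(n + c_v) % n_zc] for n in range(n_zc)]


def cyclic_corr(a, b, lag):
    """Cyclic correlation sum_n a(n) * conj(b((n + lag) mod N))."""
    n_len = len(a)
    return sum(a[n] * b[(n + lag) % n_len].conjugate() for n in range(n_len))


# Illustrative root index and shift (assumed values).
x = zc_root(129)
peak = abs(cyclic_corr(x, x, 0))          # zero lag: sum of |x(n)|^2 = N_ZC
side = abs(cyclic_corr(x, x, 13))         # non-zero lag: vanishes up to rounding
shifted = preamble(129, 13)
cross = abs(cyclic_corr(x, shifted, 0))   # two cyclic shifts of one root are orthogonal
```

Up to floating-point rounding, `peak` equals $N_{ZC}$ while `side` and `cross` are zero, which is what allows several preambles to share one root sequence.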

Copyright © 2013 The Institute of Electronics, Information and Communication Engineers


Each of the 64 preambles in a cell is assigned a cyclic shift region of length $N_{CS}$. The cyclic shift values ($C_v$) of the preambles are defined as

$$C_v = v \cdot N_{CS}, \quad v = 0, 1, ..., \lfloor N_{ZC}/N_{CS} \rfloor - 1 \tag{3}$$

If all 64 $C_v$ values cannot be generated from a single ZC sequence, the root index ($u$) of the ZC sequence is changed to the next value and $C_v$ generation continues. In this case, multiple ZC sequences with different root indexes are needed for the 64 preambles.

However, (3) does not take the Doppler effect or frequency error into account. The LTE standard calls these preambles the “unrestricted set”. The other set of preambles in the LTE standard is called the “restricted set”. The restricted set is designed for high-speed cells and takes the Doppler effect or frequency error into account. In the restricted set, some cyclic shift regions are prohibited to avoid ambiguity during preamble detection (the reason is explained in Sect. 3), and the cyclic shift values ($C_v$) are defined as follows [1]:

$$C_v = d_{start}\lfloor v/n^{RA}_{shift} \rfloor + (v \bmod n^{RA}_{shift})\,N_{CS}, \quad v = 0, 1, ..., n^{RA}_{shift}\,n^{RA}_{group} + \bar{n}^{RA}_{shift} - 1 \tag{4}$$

For $N_{CS} \le d_u < N_{ZC}/3$:

$$\begin{aligned}
n^{RA}_{shift} &= \lfloor d_u/N_{CS} \rfloor \\
d_{start} &= 2d_u + n^{RA}_{shift}N_{CS} \\
n^{RA}_{group} &= \lfloor N_{ZC}/d_{start} \rfloor \\
\bar{n}^{RA}_{shift} &= \max\left(\lfloor (N_{ZC} - 2d_u - n^{RA}_{group}d_{start})/N_{CS} \rfloor,\ 0\right)
\end{aligned} \tag{5}$$

For $N_{ZC}/3 \le d_u \le (N_{ZC} - N_{CS})/2$:

$$\begin{aligned}
n^{RA}_{shift} &= \lfloor (N_{ZC} - 2d_u)/N_{CS} \rfloor \\
d_{start} &= N_{ZC} - 2d_u + n^{RA}_{shift}N_{CS} \\
n^{RA}_{group} &= \lfloor d_u/d_{start} \rfloor \\
\bar{n}^{RA}_{shift} &= \min\left(\max\left(\lfloor (d_u - n^{RA}_{group}d_{start})/N_{CS} \rfloor,\ 0\right),\ n^{RA}_{shift}\right)
\end{aligned} \tag{6}$$

where the $d_u$ value is defined as follows and $u^{-1}$ is the multiplicative inverse of $u$ modulo $N_{ZC}$:

$$d_u = \begin{cases} u^{-1} & 0 \le u^{-1} < N_{ZC}/2 \\ N_{ZC} - u^{-1} & \text{otherwise} \end{cases} \tag{7}$$

After the preamble ID is selected (by deciding $C_v$), the preamble is transmitted using the SC-FDMA modulation procedure as shown in Fig. 1.
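As a worked illustration of (4)–(7), the sketch below evaluates $d_u$ and the restricted-set cyclic shifts for one root index. It is a direct transcription of the formulas; the function names are hypothetical, and `pow(u, -1, n)` (Python 3.8 or later) is used for the modular inverse.

```python
N_ZC = 839


def d_u(u, n_zc=N_ZC):
    """Eq. (7): fold the modular inverse of u into the range [0, N_ZC/2)."""
    u_inv = pow(u, -1, n_zc)          # multiplicative inverse of u mod N_ZC
    return u_inv if u_inv < n_zc / 2 else n_zc - u_inv


def restricted_cv(u, n_cs, n_zc=N_ZC):
    """Cyclic shifts C_v of the restricted set, eqs. (4)-(6).

    Returns an empty list when d_u satisfies neither condition (5) nor (6).
    """
    du = d_u(u, n_zc)
    if n_cs <= du < n_zc / 3:                      # condition (5)
        n_shift = du // n_cs
        d_start = 2 * du + n_shift * n_cs
        n_group = n_zc // d_start
        n_bar = max((n_zc - 2 * du - n_group * d_start) // n_cs, 0)
    elif n_zc / 3 <= du <= (n_zc - n_cs) / 2:      # condition (6)
        n_shift = (n_zc - 2 * du) // n_cs
        d_start = n_zc - 2 * du + n_shift * n_cs
        n_group = du // d_start
        n_bar = min(max((du - n_group * d_start) // n_cs, 0), n_shift)
    else:
        return []
    total = n_shift * n_group + n_bar
    return [d_start * (v // n_shift) + (v % n_shift) * n_cs for v in range(total)]


# The two restricted-set examples discussed with Fig. 7 later in the paper.
cv_case5 = restricted_cv(218, 38)   # d_u = 127, condition (5)
cv_case6 = restricted_cv(707, 26)   # d_u = 375, condition (6)
```

For $u=218$, $N_{CS}=38$ this yields six shifts, and for $u=707$, $N_{CS}=26$ it yields seven, so additional root indexes are required to reach 64 preambles in both cases.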

3. Analytic Modeling of Proposed PRACH Receiver

In this section, we explain the preamble detection flow of the proposed PRACH receiver using an analytic formulation.

Preamble detection is accomplished using the power delay profile (PDP); the preamble transmission and detection flow is shown in Fig. 1. First, assume that the UE transmits the preamble with $C_v=0$ (PID=0) and ignore the round trip delay ($T_{RTD}=0$) for now. Then the frequency domain received signal $R_u(k)$ can be represented as follows.

$$R_u(k) = H(k)X_u(k)D(\varepsilon) + I(k,\varepsilon) + W(k) \tag{8}$$

$$D(\varepsilon) = \frac{\sin(\pi\varepsilon)}{N\sin(\pi\varepsilon/N)}\, e^{j\pi\varepsilon(N-1)/N} \approx \frac{\sin(\pi\varepsilon)}{\pi\varepsilon}\, e^{j\pi\varepsilon(N-1)/N} = \mathrm{sinc}(\varepsilon)\cdot e^{j\pi\varepsilon(N-1)/N}$$

$$I(k,\varepsilon) = \sum_{p \in Q,\ p \ne k} X_u(p)H(p)\,\frac{\sin \pi\varepsilon}{N\sin(\pi(p-k+\varepsilon)/N)} \cdot e^{j\pi\varepsilon(N-1)/N} \cdot e^{-j\pi(p-k)/N}$$

where $H(k)$ and $X_u(k)$ are the frequency domain representations of the channel ($h$) and the ZC sequence ($x_u(n)$), respectively. $W(k)$ is additive white Gaussian noise, $\varepsilon = f_{error}/\Delta f_{sc}$ is the normalized frequency error and $\Delta f_{sc}=1.25$ kHz is the subcarrier spacing of the PRACH. $Q$ is the set of subcarrier indexes that contain the PRACH frequency resources.

Fig. 1 The preamble detection flow of the proposed PRACH receiver.

Fig. 2 The inter carrier interference due to frequency error.

If $\varepsilon \ne 0$, the $k$-th subcarrier of $R_u(k)$ is itself distorted by $D(\varepsilon)$, as shown in Fig. 2 and (8). It is also affected by $I(k,\varepsilon)$ from the other subcarriers, which is known as inter-carrier interference (ICI).

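For brevity, assume $h=1$ and no noise, and consider only the dominant ICI from the two adjacent subcarriers. Then (8) can be represented as follows.

$$R_u(k) = X_u(k)D(\varepsilon) + I(k,\varepsilon) = I_{-1}(\varepsilon)X_u(k-1) + D(\varepsilon)X_u(k) + I_{+1}(\varepsilon)X_u(k+1) \tag{9}$$

$$I_{\pm1}(\varepsilon) = \frac{\sin \pi\varepsilon}{N\sin(\pi(\varepsilon \pm 1)/N)} \cdot e^{j\pi\varepsilon(N-1)/N} \cdot e^{\mp j\pi/N}$$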

To calculate the delay profile, $R_u(k)$ is multiplied by the conjugate of the original ZC sequence in the frequency domain.

$$R_u(k)X_u^*(k) = I_{-1}(\varepsilon)X_u(k-1)X_u^*(k) + D(\varepsilon)X_u(k)X_u^*(k) + I_{+1}(\varepsilon)X_u(k+1)X_u^*(k) \tag{10}$$

Using the following property [9], we substitute (11) into (10).

$$X_u(k) = \mathrm{DFT}[x_u(n)] = x_u^*(u^{-1}k) \cdot X_u(0) \tag{11}$$
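Property (11) can be spot-checked numerically by comparing a directly evaluated DFT bin with the closed form. The sketch below does this for a few bins; the root index $u=129$ and the tested bins are arbitrary choices made for illustration, and the helper names are hypothetical.

```python
import cmath

N_ZC = 839


def zc(u, n):
    """x_u(n) from eq. (1), with the integer index reduced mod N_ZC."""
    n %= N_ZC
    return cmath.exp(-1j * cmath.pi * u * n * (n + 1) / N_ZC)


def dft_bin(u, k):
    """Direct evaluation of X_u(k) = sum_n x_u(n) * exp(-j*2*pi*n*k/N_ZC)."""
    return sum(zc(u, n) * cmath.exp(-2j * cmath.pi * n * k / N_ZC) for n in range(N_ZC))


def dft_bin_closed_form(u, k, x0):
    """Right-hand side of eq. (11): conj(x_u(u^-1 * k)) * X_u(0)."""
    u_inv = pow(u, -1, N_ZC)          # modular inverse, Python 3.8+
    return zc(u, u_inv * k).conjugate() * x0


u = 129                                # arbitrary root index for the check
x0 = dft_bin(u, 0)
max_err = max(abs(dft_bin(u, k) - dft_bin_closed_form(u, k, x0)) for k in (1, 2, 57, 838))
```

The property means the frequency domain sequence is again a ZC-type sequence scaled by the constant $X_u(0)$, so a receiver does not need a full DFT to obtain $X_u(k)$.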

Then the delay profile output, $\tau_u(n)$, is represented as (12), and the power delay profile (PDP) is calculated as $|\tau_u(n)|$. The preamble detector uses the PDP data ($|\tau_u(n)|$) for preamble detection.

$$\begin{aligned}
\tau_u(n) &= \mathrm{IDFT}\left[R_u(k) \cdot X_u^*(k)\right] \\
&= \mathrm{IDFT}\left[ I_{-1}(\varepsilon) \cdot \exp\left[j2\pi u^{-1}k/N\right] \cdot C_{-1} + D(\varepsilon) + I_{+1}(\varepsilon) \cdot \exp\left[-j2\pi u^{-1}k/N\right] \cdot C_{+1} \right]
\end{aligned} \tag{12}$$

$$C_{-1} = e^{j\pi(1-u^{-1})/N}, \quad C_{+1} = e^{-j\pi(1+u^{-1})/N}$$

From (12), if $\varepsilon=0$, only the $D(\varepsilon)$ term contributes to $\tau_u(n)$; therefore, a single impulse (signature) appears at the zero delay position of $|\tau_u(n)|$. If $\varepsilon \ne 0$, however, two other impulses (fake signatures) also appear on both sides of the genuine signature at a distance of $u^{-1}$, as shown in Fig. 3(a). Because of these fake signatures, the cyclic shift regions (length $N_{CS}$) in which fake signatures appear are excluded from preamble transmission to avoid ambiguity during preamble detection. Due to this restriction, the $C_v$ values of the restricted set are defined as shown in (4).

If we consider an arbitrary $C_v$ value (arbitrary PID), the genuine signature moves to position $S_v$ on the delay profile output ($|\tau_u(n)|$) because of the transmitter's cyclic shift $C_v$. If we also consider a practical multi-path channel and the uplink timing error due to the round trip delay ($T_{RTD}$) between the UE and the BS, $|\tau_u(n)|$ takes the shape of Fig. 3(b). Due to the multi-path channel, the signature is no longer an impulse but is spread in delay, and the round trip delay appears as a delay ($T_{RTD}$) on the delay profile output.

The amplitudes of the signatures ($D(\varepsilon)$, $I_{\pm1}(\varepsilon)$) vary with the frequency error ($\varepsilon$), as shown in Fig. 3 and Fig. 2. As $\varepsilon$ increases, the peak of the genuine signature decreases while the fake signatures grow. Therefore, if only the single $N_{CS}$ cyclic shift region of the genuine signature is checked, detection performance degrades because of the reduced genuine peak. The preamble cannot be detected at all if $\varepsilon \approx 1.0$ in a line-of-sight (LOS) environment, because the genuine signature disappears completely and only the fake signatures remain on the delay profile output, as shown in Fig. 3(c). To overcome these problems, the proposed preamble detector uses the PDP combining method, as shown in Fig. 3(d). PDP combining is conducted as follows.

$$\tau_{comb,v}(n) = |\tau_u((S_v + n - d_u) \bmod N_{ZC})| + |\tau_u((S_v + n) \bmod N_{ZC})| + |\tau_u((S_v + n + d_u) \bmod N_{ZC})| \tag{13}$$

where $n=0,...,N_{ZC}-1$ and $v=0,...,63$. After PDP combining for each $v$, the signature is searched for using a threshold value. Once a signature is found (the peak value is larger than the threshold), $v$ is taken as the PID and the round trip delay is measured from the peak location within the $N_{CS}$ cyclic shift region.

Fig. 3 The delay profile output ($|\tau_u(n)|$). (a): $\varepsilon \approx 0.25$, $h=1$, no noise, $T_{RTD}=0$, $C_v=0$. (b): $\varepsilon \approx 0.5$, multi-path channel, arbitrary $C_v$, $T_{RTD} \ne 0$. (c): $\varepsilon \approx 1.0$, line-of-sight (LOS), arbitrary $C_v$, $T_{RTD} \ne 0$.
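The combining rule (13) and the threshold test can be sketched on a synthetic PDP. Everything numeric below (noise floor, peak heights, threshold, delay) is made up for illustration, and the window is evaluated over one $N_{CS}$ region only; the function names are hypothetical.

```python
N_ZC = 839


def combine_pdp(pdp, s_v, d_u, n_cs):
    """Eq. (13) over one N_CS window: add the window starting at s_v to the
    two windows offset by -d_u and +d_u (all indexes taken mod N_ZC)."""
    return [
        pdp[(s_v + n - d_u) % N_ZC] + pdp[(s_v + n) % N_ZC] + pdp[(s_v + n + d_u) % N_ZC]
        for n in range(n_cs)
    ]


def detect(pdp, s_v, d_u, n_cs, threshold):
    """Return the peak offset inside the window if it exceeds the threshold, else None."""
    window = combine_pdp(pdp, s_v, d_u, n_cs)
    peak = max(window)
    return window.index(peak) if peak > threshold else None


# Synthetic PDP magnitudes: the genuine signature has vanished and only the
# two fake signatures remain, as in the LOS case of Fig. 3(c).
d_u_ex, n_cs_ex, s_v_ex, delay = 127, 38, 471, 5
pdp = [0.1] * N_ZC
pdp[(s_v_ex + delay - d_u_ex) % N_ZC] = 6.0
pdp[(s_v_ex + delay + d_u_ex) % N_ZC] = 5.0

found = detect(pdp, s_v_ex, d_u_ex, n_cs_ex, threshold=8.0)
genuine_only_peak = max(pdp[s_v_ex:s_v_ex + n_cs_ex])
```

In this example the genuine window alone stays below the threshold, while the combined window recovers the peak at the correct delay offset.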

4. Hardware Design of Proposed PRACH Receiver

The block diagram of the proposed PRACH receiver is shown in Fig. 4. First, the time domain frequency shifter moves the PRACH frequency resources to the $f=0$ Hz position so that the anti-aliasing (low-pass) filter can extract the PRACH resources (bandwidth = 1.08 MHz) before decimation. Decimation is essential to avoid a huge 24576-pt FFT; with a decimation factor of $M = 12$, the FFT is reduced to 2048 points. After OFDM demodulation through the FFT, the received frequency domain preamble ($R_u(k)$) is multiplied by the conjugate of the original sequence ($X_u^*(k)$), and the resulting sequence goes to the inverse FFT (IFFT) block to obtain the power delay profile (PDP). The PDP output data are fed to the proposed preamble detector as shown in Fig. 4; the detailed block diagram of the proposed preamble detector is shown in Fig. 5.

In Fig. 5, the PDP data (IFFT output) are accompanied by their index $n=0,1,2,...,N_{FFT}-1$. However, the PRACH-related values ($C_v$, $N_{CS}$, $d_u$) are defined over the range $i=0,1,2,...,N_{ZC}-1$. Therefore, the $n$ index is converted to the $i$ index using the following equation.

$$i = n \cdot N_{ZC}/N_{FFT} \tag{14}$$

In the proposed method, the $S_v$ values of the preambles are prepared before the PDP output data arrive. Each $S_v$ value is the starting $i$ index of the cyclic shift region on the PDP output that corresponds to the transmitter's $C_v$ value. The $S_v$ values can be generated using (3), (4) and the following equation.

$$S_v = (N_{ZC} - C_v) \bmod N_{ZC} \tag{15}$$

Fig. 4 The PRACH receiver block diagram.

Fig. 5 The proposed preamble detector block diagram.

However, we propose a more efficient method to generate the $S_v$ values. The proposed method uses simple arithmetic operations instead of the dividers, multipliers and nested calculation of (4). It can be implemented with a simple finite-state machine (FSM) and is efficient for hardware implementation.

The $S_v$ values are generated by the “Sequential Sv Gen” block of Fig. 5. The algorithm and a graphical example are shown in Fig. 6 and Fig. 7, respectively. In Fig. 6, the variables are initialized differently for the unrestricted set and for cases (5) and (6) of the restricted set. The $S_v$ values are then generated sequentially.

For the unrestricted set of Fig. 7(a), the cyclic shift ($N_{CS}$) regions of the preambles are located consecutively, except for the “unused” region between the first and last $N_{CS}$ regions. The “unused” region is either the region remaining after all 64 $N_{CS}$ regions have been assigned or a short region that is not large enough for one $N_{CS}$ region.

For the restricted set, we define a “segment” as a PDP region in which $S_v$ values are located consecutively, as shown in Figs. 7(b) and (c); the variable segLen is the length of a segment. Figure 7(b) shows an example of the restricted set ($u=218$, $N_{CS}=38$, $d_u=127$) that corresponds to case (5). Segments A and B are the regions for genuine signatures, and segments A+, A−, B+ and B− are for the fake signatures of A and B caused by Doppler effect or frequency error. For case (5), we set the segment length to segLen $=d_u=127$. The first $N_{CS}$ region for PID=0 ($S_{v=0}$) is located at $i=0$, and the next $N_{CS}$ region ($S_{v=1}$) is located consecutively to the left (cyclically), so that it appears at the rightmost position of the PDP output. Segment A can hold up to three $N_{CS}$ regions ($S_{v=0,1,2}$) because the A− region already occupies the next position. To avoid a conflict, the $N_{CS}$ region for $S_{v=3}$ is located dist2Nxt $=2\cdot d_u$ away from $S_{v=2}$ so that it avoids the A− and B+ segment regions.

Figure 7(c) shows the $S_v$ locations for $u=707$, $d_u=375$, $N_{CS}=26$, which fulfills condition (6). The $N_{CS}$ regions (i.e. the $S_v$ values) are located similarly, except that the segment length is set to segLen $=N_{ZC}-2\cdot d_u$ and $d_u$ is relatively larger than in case (b).

The variable endIdx is the lower boundary of valid $S_v$ values on the PDP output, and dist2End is the minimum distance from $S_v$ to endIdx for one more $N_{CS}$ region to fit in. ulimit and llimit are the upper and lower limits of a segment. After $S_v$ generation ends, the variable nSv contains the number of preambles contained in a PDP output.

    // Variable initialization
    nSv = 0; Sv = Nzc;
    if (isRestrictedSet) {
        du = table(u);                      // calculated from (7)
        if (du < 280) {                     // check condition (5)
            segLen = du;            dist2Nxt = 2*segLen;
            endIdx = Ncs + segLen;  dist2End = segLen;
        } else {                            // condition (6)
            segLen = Nzc - 2*du;    dist2Nxt = segLen;
            endIdx = Ncs + Nzc - du; dist2End = 0;
        }
        ulimit = Sv + Ncs - 1; llimit = ulimit - segLen + 1;
    } else {
        du = segLen = dist2Nxt = ulimit = llimit = 0;
        endIdx = Ncs; dist2End = 0;
    }

    // Sequential Sv generation; SvReg[] holds the generated S_v values
    while (nSv < 64 && Sv - dist2End >= endIdx) {
        if (isRestrictedSet) {              // restricted set
            bool isSvValid = (Sv + Ncs - 1 <= ulimit) && (Sv >= llimit);
            if (isSvValid) {
                SvReg[nSv++] = mod(Sv, Nzc);
                Sv -= Ncs;
            } else {                        // segment full: jump to the next segment
                Sv -= dist2Nxt;
                ulimit = Sv + Ncs - 1; llimit = ulimit - segLen + 1;
            }
        } else {                            // unrestricted set
            SvReg[nSv++] = mod(Sv, Nzc);
            Sv -= Ncs;
        }
    }

Fig. 6 The proposed sequential $S_v$ generation algorithm.

Fig. 7 Graphical analysis of $S_v$ generation with some examples.

Fig. 8 The nSv distribution for each $N_{CS}$ of restricted set.
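To make the sequential generation of Fig. 6 easier to experiment with, here is a Python rendering of the same loop. It is a sketch rather than the authors' implementation: the function name, the use of `d_u=None` to select the unrestricted set, and the returned list are assumptions made for this example.

```python
def generate_sv(n_zc, n_cs, d_u=None, max_preambles=64):
    """Sequentially generate S_v start indexes following Fig. 6.

    d_u=None selects the unrestricted set; otherwise d_u is assumed to
    satisfy condition (5) or (6).
    """
    restricted = d_u is not None
    if restricted:
        if d_u < n_zc / 3:                 # condition (5), "du < 280" for N_ZC = 839
            seg_len = d_u
            dist2nxt = 2 * seg_len
            end_idx = n_cs + seg_len
            dist2end = seg_len
        else:                              # condition (6)
            seg_len = n_zc - 2 * d_u
            dist2nxt = seg_len
            end_idx = n_cs + n_zc - d_u
            dist2end = 0
    else:
        seg_len = dist2nxt = dist2end = 0
        end_idx = n_cs

    sv = n_zc                              # running start index, walks downwards
    ulimit = sv + n_cs - 1                 # segment limits (restricted set only)
    llimit = ulimit - seg_len + 1
    sv_list = []

    while len(sv_list) < max_preambles and sv - dist2end >= end_idx:
        if restricted and not (sv + n_cs - 1 <= ulimit and sv >= llimit):
            # Current segment is full: skip over the fake-signature segments.
            sv -= dist2nxt
            ulimit = sv + n_cs - 1
            llimit = ulimit - seg_len + 1
            continue
        sv_list.append(sv % n_zc)
        sv -= n_cs
    return sv_list


sv_case5 = generate_sv(839, 38, d_u=127)   # Fig. 7(b) example
sv_case6 = generate_sv(839, 26, d_u=375)   # Fig. 7(c) example
sv_unres = generate_sv(839, 93)            # unrestricted set, N_CS = 93
```

For the two examples of Fig. 7, the generated values agree with evaluating (4) and then (15) by hand, using only additions, subtractions and comparisons inside the loop.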

There are 838 values of $u$ and 16 values of $N_{CS}$. The $S_v$ values are generated differently for each combination of $u$ and $N_{CS}$. Therefore, the preamble detector must regenerate the $S_v$ values whenever the $u$ value changes during preamble detection.

The generated $S_v$ values are saved to the “SvReg” block of Fig. 5. Only 18 registers (not 64) are needed to save the $S_v$ values because the maximum value of $n_{Sv}$ is only 18 in the restricted set, as shown in Fig. 8. Figure 8 shows the $n_{Sv}$ distribution over all root indexes for each $N_{CS}$ value of the restricted set. For example, for $N_{CS}=15$, a total of 796 root indexes are available (they fulfill (5) or (6)) and between 9 and 18 $S_v$ values can be generated from one root index. Therefore, 18 registers are enough for the “SvReg” block to save all generated $S_v$ values. For the unrestricted set, $n_{Sv} = \lfloor N_{ZC}/N_{CS} \rfloor$ and the maximum value of $n_{Sv}$ is 64. However, in the proposed method, only two $S_v$ values (the first and the last) are saved to “SvReg” for the unrestricted set. From the first and last $S_v$ values, the “unused” region is found, and the other $N_{CS}$ regions can be found simply by counting the PDP output, as shown in Fig. 7(a).

The “SvReg” block outputs the $S_v$ values as well as their fake signature regions $S_v^+$ and $S_v^-$, as follows.

$$\begin{aligned}
&S_v, & v &= 0, 1, ..., 17 \\
&S_v^+ = (S_v + d_u) \bmod N_{ZC}, & v &= 0, 1, ..., 17 \\
&S_v^- = (S_v - d_u) \bmod N_{ZC}, & v &= 0, 1, ..., 17
\end{aligned} \tag{16}$$

All of the generated $S_v$, $S_v^+$, $S_v^-$ values are compared simultaneously with the $i$ index value in the “Matching” block to find the starting position of an $N_{CS}$ region. The “Matching” block therefore consists of a total of $3 \cdot 18 = 54$ constant comparators. The registers in the “SvReg” block are initialized with a value greater than $N_{ZC}$, and the generated $S_v$ values overwrite the registers. Consequently, among the outputs of “SvReg”, the $S_{v \ge n_{Sv}}$, $S^+_{v \ge n_{Sv}}$, $S^-_{v \ge n_{Sv}}$ values never match $i$ because the initial value is always greater than $i$. In effect, the “Matching” block compares only the generated and saved $S_v$ values ($v < n_{Sv}$).

If any of the three values $S_v$, $S_v^+$, $S_v^-$ of some $v$ matches $i$, the “Matching” block outputs the matched $v$ value to the “opCode” block as $v_{matched}$, as shown in Fig. 7(c). The “opCode” block has 18 opcode registers that hold the “write”, “accum” or “output” operating code for the combining operation, as follows.

opcode[v] = {“write”, “accum”, “output”}, v = 0, 1, 2, ..., 17

All opcode[v] are initialized to “write”, and whenever $v_{matched}$ is asserted, the corresponding opcode (opcode[$v_{matched}$]) advances sequentially to “accum” and then “output”, as shown in Fig. 7 (‘w’≡write, ‘a’≡accum, ‘o’≡output). opcode[$v_{matched}$] goes to the PDP combining logic as “cmd” and controls the PDP combining operation. The PDP combining logic sums the $N_{CS}$ region of the genuine signature and its two fake signature regions. To do this, when the first of the three $N_{CS}$ regions (starting from $S_v$, $S_v^+$, $S_v^-$) comes in from the IFFT, “cmd” is set to “write” and the PDP data are written directly to the “combMem” block. For the second $N_{CS}$ region, “cmd” is set to “accum”, so that the PDP data are accumulated with the “combMem” output (which holds the first $N_{CS}$ region) and written back to “combMem” in place. When the third $N_{CS}$ region comes in, “cmd” is set to “output”, and the third region is accumulated with the “combMem” output and fed to the “peakSearch” block. To accomplish this combining operation, the address (addr) of the “combMem” block is generated as follows.

$$\mathrm{addr} = N_{CS,NFFT} \cdot v_{matched} + p \tag{17}$$

$$p = 0, 1, ..., N_{CS,NFFT} - 1, \qquad N_{CS,NFFT} = \lfloor N_{CS} \cdot N_{FFT}/N_{ZC} \rfloor$$

where $p$ is a counter value synchronized to the $n$ index and ranges from 0 to $N_{CS,NFFT}-1$. With this address generation, the three $N_{CS}$ regions that belong to the same $v$ value are mapped to the same memory area. Therefore, the PDP combining operation can be accomplished instantaneously.

The minimum memory depth required for “combMem” can be derived as follows.

$$\mathrm{depth}_{min} = \lceil \max(N_{CS} \cdot n_{Sv}) \cdot N_{FFT}/N_{ZC} \rceil \tag{18}$$

The maximum value of $N_{CS} \cdot n_{Sv} = 276$ occurs at $N_{CS} = 46$ and $n_{Sv} = 6$; therefore, the minimum required memory depth is $\mathrm{depth}_{min} = 674$.
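The address mapping (17) and the depth bound (18) amount to a few lines of integer arithmetic. The sketch below reproduces the quoted depth of 674; rounding $N_{CS,NFFT}$ down and rounding the depth up are assumptions chosen to be consistent with that figure, and the function names are hypothetical.

```python
import math

N_ZC, N_FFT = 839, 2048


def region_len_fft(n_cs):
    """N_CS,NFFT of eq. (17): one N_CS region counted in IFFT samples
    (rounded down here, which is an assumption)."""
    return (n_cs * N_FFT) // N_ZC


def comb_mem_addr(v_matched, p, n_cs):
    """Eq. (17): the three regions of the same v share one memory area."""
    return region_len_fft(n_cs) * v_matched + p


def min_depth(max_ncs_times_nsv):
    """Eq. (18): minimum combMem depth (rounded up)."""
    return math.ceil(max_ncs_times_nsv * N_FFT / N_ZC)


depth = min_depth(46 * 6)                                  # N_CS = 46, nSv = 6
last_addr = comb_mem_addr(5, region_len_fft(46) - 1, 46)   # highest address used
```

With these assumptions the highest address generated for $N_{CS}=46$ and six preambles stays below the minimum depth, so the whole combining result fits in the memory.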

For the unrestricted set, only two $S_v$ values (the first, $S_{v=0}$, and the last, $S_{v=n_{Sv}-1}$) are saved into the “SvReg” block. After the second match ($S_{v=n_{Sv}-1}$), the “Matching” block counts the transitions of the $i$ value and repeatedly generates $v_{matched}$ at every $N_{CS}$-th transition of $i$. The opcode is always set to “output”; PDP combining is not used in the unrestricted set, and the PDP data are fed through directly without accumulation.

The “peakSearch” block checks the $N_{CS}$ regions when “cmd”=“output” and finds a peak that is larger than a predefined threshold value, as shown in Fig. 5. If a valid peak is found, the round trip delay between the UE and the BS is measured from the peak position within the $N_{CS}$ region. The PID value is decided by

$$\mathrm{PID} = v_{matched} + n_{Pid} \tag{19}$$

where $n_{Pid}$ is the total number of PIDs already processed before the current PDP output, as shown in Fig. 7(c).

With the proposed method, the whole preamble detection procedure is completed immediately after the PDP output data arrive. Therefore, the proposed method minimizes the preamble detection latency of the PRACH receiver.

5. Implementation and Performance of PRACH Receiver

The proposed PRACH receiver was implemented in hardware to investigate its feasibility and hardware resource usage. The structure of the PRACH receiver is shown in Fig. 4. The implemented PRACH receiver consists of 4 pipelined stages, as shown in Fig. 4 and Fig. 9.

Stage0 is executed only once for each PRACH subframe and consists of the time domain frequency shifter, the decimation filter and the FFT processor. A 12-bit ADC is used, and a 35-tap FIR filter serves as the decimation filter. The filter is implemented efficiently using the sub-expression elimination method [11]. The FFT/IFFT processor is a 2048-pt pipelined FFT/IFFT processor, so it can handle back-to-back continuous input data. As a result of stage0, the received frequency domain preamble sequence is stored in a buffer, as shown in Fig. 4.

In stage1, the next valid root index ($u$) is searched for before the IFFT input data are generated. Finding a valid root index means checking conditions (5) and (6) while stepping through the logical root indexes defined in the LTE standard [1].

The IFFT input data generation in stage2 consists of frequency domain ZC sequence generation (length $=N_{ZC}$), buffer control to read out the received ZC sequence, zero-padding (zero length $=N_{FFT}-N_{ZC}$) and complex multiplication. The frequency domain ZC sequence generation block is implemented efficiently using [9].

Stage3 consists of the pipelined IFFT processor and the preamble detector. The preamble detector detects preambles from the PDP data without any delay; this enables the IFFT processor to generate multiple PDP outputs back-to-back, which maximizes PDP generation throughput.

Stage1, 2 and 3 may be repeated multiple times to accomplish preamble detection for each PRACH subframe.

To increase throughput and reduce the detection latency, the implemented PRACH receiver uses two different clock domains. Stage0 uses the sample rate clock CLK = 30.72 MHz, and stage1, 2 and 3 use a higher clock frequency, CLKX. The detection latency ($N_{latency}$) of the implemented PRACH receiver can be calculated as follows.

$$N_{latency} = N_{FFTdelay} + N_{FFT} \cdot n_{Root} \tag{20}$$

where $N_{latency}$ is the number of clock cycles from the end of the PRACH input subframe to the end of preamble detection, as shown in Fig. 9, and $n_{Root}$ denotes the total number of root indexes needed to generate all 64 preambles. As the equation shows, the latency depends on $n_{Root}$, and the worst-case $n_{Root}$ value is 64, as shown in Fig. 10. Figure 10 shows the $n_{Root}$ distribution over valid root indexes for each $N_{CS}$ value of the restricted and unrestricted sets. For example, in the restricted set with $N_{CS}=128$, between 35 and 64 roots, out of 456 valid roots, are needed to generate 64 preambles. In the unrestricted set, all root indexes are valid, and for $N_{CS}=93$ a total of 8 roots are needed to generate 64 preambles.

Fig. 9 The proposed PRACH receiver pipeline timing diagram.

To compare detection latency performance, we con-sider digital signal processor (DSP) based preamble detec-tion method. In DSP based method, we assume DSP controla hardware accelerator to generate the PDP data of Fig. 4and the PDP data are stored to a buffer memory. And then,DSP accesses the buffer memory data and processes pream-ble detection. For simplicity, we assume the DSP operatesmuch higher clock frequency than buffer memory and DSPinternal execution time is very small so that we consider thebuffer memory access time and PDP generation time as thepreamble detection latency. The detection latency of DSPbased method can be calculated as following.

N_{latency.DSP} = N_{PDPcycle} \cdot N_{CS,NFFT} \cdot 64 + n_{Root} \cdot (N_{FFTdelay} + N_{FFT})    (21)

where NPDPcycle=3 for the restricted set, representing the access time of 3 samples for PDP combining, and NPDPcycle=1 for the unrestricted set. The detection latencies of the proposed method and the DSP-based method are calculated and shown in Fig. 11. For each NCS value, the detection latency of the proposed method is approximately 2.7–3.8 times shorter than that of the DSP-based method.
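The comparison between the two latency formulas can be sketched as follows. The sketch reads Eq. (21) as NPDPcycle · NCS,NFFT · 64 + nRoot · (NFFTdelay + NFFT), treating NCS,NFFT as a single quantity (the per-preamble buffer access length) that the caller supplies. The value 227 in the usage lines is a hypothetical input chosen only for illustration, not a figure taken from the paper, and all function names are illustrative.

```python
N_FFT = 2048        # IFFT size (from the text)
N_FFT_DELAY = 4240  # pipeline delay in cycles (from the text)
N_PREAMBLES = 64    # preambles per cell (from the text)


def proposed_latency_cycles(n_root: int) -> int:
    """Eq. (20): latency of the proposed instantaneous detector."""
    return N_FFT_DELAY + N_FFT * n_root


def dsp_latency_cycles(n_pdp_cycle: int, n_cs_nfft: int, n_root: int) -> int:
    """Eq. (21) as read above: buffer access time plus PDP generation time."""
    access = n_pdp_cycle * n_cs_nfft * N_PREAMBLES
    generation = n_root * (N_FFT_DELAY + N_FFT)
    return access + generation


def latency_ratio(n_pdp_cycle: int, n_cs_nfft: int, n_root: int) -> float:
    """How many times longer the DSP-based latency is than the proposed one."""
    return dsp_latency_cycles(n_pdp_cycle, n_cs_nfft, n_root) / proposed_latency_cycles(n_root)


if __name__ == "__main__":
    # Hypothetical unrestricted-set case: NPDPcycle=1, NCS,NFFT=227, nRoot=8.
    print(dsp_latency_cycles(1, 227, 8), proposed_latency_cycles(8))
    print(round(latency_ratio(1, 227, 8), 2))
```

As nRoot grows, the access term becomes small relative to the generation term, so the ratio approaches (NFFTdelay + NFFT)/NFFT, which is about 3.07 for the values above.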

Fig. 10 The nRoot distribution for each NCS of restricted set.

Fig. 11 The detection latency of proposed method and DSP based method (NFFTdelay=4240, NFFT=2048).

Table 1 Detection latency of implemented PRACH receiver.

CLKX (MHz)           30.72   61.44   122.88   136     245.76
Worst Latency (ms)   4.4     2.2     1.1      0.995   0.551

Table 2 Hardware resource utilization of implemented PRACH receiver.

Block      FPGA(1) SLICEs (F/Fs, LUTs)   ASIC(2) Eq. Gates   Description
CPX        24 (46, 40)                   458                 CP remover
TDFS       864 (2113, 2964)              37948               time domain frequency shifter
DECIM      1539 (3112, 5492)             50735               decimation filter
Control    53 (54, 99)                   759                 find valid root index, pipeline control
DFTZC      1467 (3567, 4747)             52093               DFT(x_{u,v=0}(n)), implemented with [9]
ZPC        67 (242, 142)                 5579                zero-padding, complex multiplier
FFT/IFFT   1340 (4060, 2287)             72849               pipelined 2048pt FFT
PrmblDet   687 (1583, 1867)              16555               proposed preamble detector
Total      6041 (14777, 17638)           236976              excluding memory

(1) XC6VLX130T
(2) SMIC 0.13 μm, 1 Eq. Gate = 1 NAND gate with 2 inputs

The worst-case latency of the proposed method occurs when nRoot=64; in this case, stage1, 2 and 3 iterate 64 times. The implemented PRACH receiver has NFFTdelay=4240 and NFFT=2048. Therefore, the maximum latency is Nlatency.max = 4240 + 2048·64 = 135312 clock cycles. A subframe in the LTE system has a duration of 1 ms and contains 30720 samples (TS = 1/30.72 MHz). Therefore, if stage1, 2 and 3 use a clock frequency of CLKX = 136 MHz (≥ 135312/30720 · 30.72 MHz ≈ 135.3 MHz), preamble detection can be completed within one subframe. The worst-case detection latency for several clock frequencies is summarized in Table 1. The hardware resource utilization of the implemented PRACH receiver is summarized in Table 2 and Table 3.
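The worst-case figures of Table 1 follow directly from the cycle count above. The short Python sketch below reproduces them and the minimum CLKX needed to finish within a 1 ms subframe. The function names are illustrative, and the only inputs are the values quoted in the text.

```python
N_FFT = 2048        # IFFT size (from the text)
N_FFT_DELAY = 4240  # pipeline delay in CLKX cycles (from the text)
WORST_N_ROOT = 64   # worst-case number of root indexes (from the text)


def worst_case_cycles() -> int:
    """Nlatency.max = NFFTdelay + NFFT * 64."""
    return N_FFT_DELAY + N_FFT * WORST_N_ROOT


def latency_ms(cycles: int, clkx_mhz: float) -> float:
    """Convert a CLKX cycle count to milliseconds."""
    return cycles / (clkx_mhz * 1e3)


def min_clkx_mhz(cycles: int, budget_ms: float = 1.0) -> float:
    """Lowest CLKX (MHz) that completes `cycles` within `budget_ms`."""
    return cycles / (budget_ms * 1e3)


if __name__ == "__main__":
    cycles = worst_case_cycles()
    for clkx in (30.72, 61.44, 122.88, 136.0, 245.76):
        print(f"CLKX = {clkx:7.2f} MHz -> {latency_ms(cycles, clkx):.3f} ms")
    print(f"minimum CLKX for 1 ms: {min_clkx_mhz(cycles):.3f} MHz")
```

The minimum works out to 135.312 MHz, which is why 136 MHz is the lowest listed clock that keeps the worst case under one subframe.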

Table 3 Memory usage of implemented PRACH receiver.

Memory    depth x width x ea   Description
BUFFER    2048 x 16 x 2        save received preamble (Xrcv) as shown in Fig. 4
DUtable   838 x 10 x 1         save u and its du value
DCvalue   838 x 12 x 2         save DC values for each u for calculating [9]
combMem   674 x 16 x 1         memory for PDP combining

Fig. 12 The miss detection ratio (MDR) simulation of implemented PRACH receiver using bit-accurate C-model.

In addition, we simulate the miss detection ratio (MDR) using a bit-accurate C model of the implemented PRACH receiver; the results are shown in Fig. 12. During the MDR simulation, the following cases are counted as miss detection [12]: (1) detecting a different preamble, (2) not detecting the transmitted preamble, and (3) detecting the preamble but with a timing estimation error larger than 1.04 μs in AWGN or 2.08 μs in the ETU70 channel. The MDR simulation is performed on the AWGN channel and the ETU70 channel with several frequency error values. The ETU70 channel is a multi-path (9-path) fading channel with a maximum Doppler frequency of 70 Hz due to the UE's mobility [13].
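The three miss-detection cases can be expressed as a small classification routine. This is a minimal sketch and not the authors' C model: it assumes each trial reports at most one detected preamble index (None when nothing is detected) together with a timing error in microseconds, and the names `is_miss_detection` and `miss_detection_ratio` are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# Timing error limits in microseconds, as quoted in the text.
TIMING_LIMIT_US = {"AWGN": 1.04, "ETU70": 2.08}


def is_miss_detection(tx_preamble: int,
                      detected_preamble: Optional[int],
                      timing_error_us: float,
                      channel: str) -> bool:
    """Return True if a trial counts as a miss detection."""
    if detected_preamble is None:
        return True                      # case 2: transmitted preamble not detected
    if detected_preamble != tx_preamble:
        return True                      # case 1: a different preamble detected
    # case 3: correct preamble, but timing estimate outside the limit
    return abs(timing_error_us) > TIMING_LIMIT_US[channel]


def miss_detection_ratio(trials: Iterable[Tuple[int, Optional[int], float]],
                         channel: str) -> float:
    """MDR over (tx_preamble, detected_preamble, timing_error_us) trials."""
    results = [is_miss_detection(tx, det, err, channel) for tx, det, err in trials]
    return sum(results) / len(results)


if __name__ == "__main__":
    demo = [(5, 5, 0.3), (5, None, 0.0), (5, 7, 0.1), (5, 5, 1.5)]
    print(miss_detection_ratio(demo, "AWGN"))   # three of the four trials are misses
    print(miss_detection_ratio(demo, "ETU70"))  # the 1.5 us trial passes here
```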

First, we examine the AWGN channel results of Fig. 12. If there is no frequency error (fe=0 Hz), the MDR performance of the restricted set and the unrestricted set is the same, and PDP combining provides no benefit.

In the case of fe=625 Hz (ε=0.5), both genuine and fake signatures appear in the PDP output, as shown in Fig. 3(b). Because the peak of the genuine signature is reduced by the frequency error (as mentioned in Sect. 3), the detection performance is degraded, as shown in Fig. 12. With PDP combining, however, the detection performance improves dramatically, and preambles can be detected robustly even under a strong Doppler effect or frequency error.

If fe=1340 Hz (ε=1.07), the genuine signature disappears and only the fake signature exists (as shown in Fig. 3(c)). This phenomenon occurs when the frequency error is almost equal to the subcarrier spacing (fe ≈ Δfsc = 1.25 kHz) or under a severe Doppler effect in a line-of-sight (LOS) environment. As shown in Fig. 12, the preamble cannot be detected without PDP combining.
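The ε values quoted for the simulated cases are consistent with normalizing the frequency error by the PRACH subcarrier spacing. The following sketch assumes ε = fe/Δfsc with Δfsc = 1.25 kHz; the relation is inferred from the quoted numbers, and the function name is illustrative.

```python
SUBCARRIER_SPACING_HZ = 1250.0  # PRACH subcarrier spacing quoted in the text


def normalized_frequency_error(fe_hz: float) -> float:
    """Assumed relation: epsilon = fe / delta_f_sc."""
    return fe_hz / SUBCARRIER_SPACING_HZ


if __name__ == "__main__":
    for fe in (0.0, 270.0, 625.0, 1340.0):
        print(f"fe = {fe:6.0f} Hz -> epsilon = {normalized_frequency_error(fe):.2f}")
```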

For the multi-path fading channel (ETU70) with frequency error, PDP combining also improves the preamble detection performance, as shown by the ETU70 MDR results in Fig. 12.

The LTE standard requires that the detection probability be Pd > 99% at the SNR levels listed in Table 4 [12]. Table 4 gives the minimum SNR requirements to achieve 99% detection probability. The minimum SNR values are also drawn in Fig. 12. In Fig. 12, all MDR results of the implemented PRACH receiver with PDP combining fulfill these requirements in both the AWGN channel and the ETU70 multi-path fading channel.

Table 4 The minimum SNR requirement to achieve Pd > 99%.

Condition      Channel (Freq. Offset)   Minimum SNR for 99% detection (dB)
UnRestricted   AWGN (0 Hz)              −16.5
               ETU70(3) (270 Hz)        −10.1
Restricted     AWGN (0 Hz)              −16.6
               ETU70(3) (270 Hz)        −9.5
               AWGN (625 Hz)            −14.4
               AWGN (1340 Hz)           −15.7

6. Conclusions

We propose a random access preamble detection method for the 3GPP LTE uplink and implement a PRACH receiver that incorporates it. The implemented PRACH receiver maximizes the PDP generation throughput through back-to-back continuous PDP generation and reduces the preamble detection latency through the instantaneous preamble detection method. The proposed preamble detector can detect all existing preambles directly and instantaneously from the IFFT output while conducting PDP combining. PDP combining is very effective for robust preamble detection in the presence of frequency error (or Doppler effect). The implemented PRACH receiver has a worst-case detection latency of less than 1 ms with a 136 MHz clock. It occupies 30.2% of the SLICEs of an XC6VLX130T FPGA device, or approximately 237k equivalent gates in 0.13 μm ASIC technology.

References

[1] 3GPP, TS 36.211, “Physical channels and modulation,” March 2009.

[2] Y. Kishiyama, K. Higuchi, and M. Sawahashi, “Investigations on physical random access channel structure in evolved UTRA uplink,” IEICE Trans. Commun., vol.E92-B, no.5, pp.1688–1694, May 2009.

[3] S. Sesia, I. Toufik, and M. Baker, LTE - The UMTS Long Term Evolution: From Theory to Practice, John Wiley & Sons, 2009.

[4] F.J. Lopez-Martinez, E. del Castillo-Sanchez, E. Martos-Naya, and J.T. Entrambasaguas, “Performance evaluation of preamble detectors for 3GPP-LTE physical random access channel,” Digit. Signal Process., vol.22, no.3, pp.526–534, May 2012.

[5] A. Freire-Irigoyen, R. Torrea-Duran, S. Pollin, Min Li, E. Lopez, and L. Van der Perre, “Energy efficient PRACH detector algorithm in SDR for LTE femtocells,” IEEE Symposium on Commun. and Vehicular Technology (ISCVT), pp.1–5, Nov. 2011.

[6] M.S. Lee and Y.M. Choi, “An efficient receiver for preamble detection in LTE SC-FDMA system with an antenna array,” IEEE Commun. Lett., vol.14, no.12, pp.1167–1169, Dec. 2010.

[7] J. Berkmann, C. Carbonelli, F. Dietrich, C. Drewes, and W. Xu, “On 3G LTE terminal implementation - Standard, algorithm, complexities and challenges,” IEEE Wireless Commun. and Mobile Conf. (IWCMC), pp.970–975, Aug. 2008.

[8] D.C. Chu, “Polyphase codes with good periodic correlation properties,” IEEE Trans. Inf. Theory, vol.IT-18, no., pp.531–532, July 1972.

[9] S. Beyme and C. Leung, “Efficient computation of DFT of Zadoff-Chu sequences,” IEEE Electron. Lett., vol.45, no.9, April 2009.

[10] Panasonic, NTT DoCoMo, “R1-073624: Limitation of RACH sequence allocation for high mobility cell,” 3GPP TSG RAN WG1, meeting 50, Aug. 2007.

[11] H. Kamal, J.H. Lee, and B.T. Koo, “An improved non-CSD 2-bit recursive common subexpression elimination method to implement FIR filter,” ETRI Journal, vol.33, no.5, pp.695–703, Oct. 2011.

[12] 3GPP, TS 36.104, “Base station (BS) radio transmission and reception,” March 2009.

[13] 3GPP, TS 36.101, “User equipment (UE) radio transmission and reception,” March 2009.

Joohyun Lee was born in Daegu, South Korea. He received the M.S. degree in electrical engineering from Pohang University of Science and Technology (POSTECH), Korea, in 1998. He is currently with ETRI, Korea. His research interests include synchronization of OFDM receivers, multimedia broadcasting, and LTE-based FemtoCell technology.

Bontae Koo received the B.S. and M.S. degrees in electrical engineering from Korea University, in 1989 and 1991, respectively. He was with Hyundai Electronics in Korea from 1991 to 1997 and ASPEC in San Jose, USA, from 1997 to 1999. He joined ETRI, Korea, in 1999. Currently, he serves as the team leader for the Mobile Comm. and Broadcasting Convergence SoC Research Team.

Hyuckjae Lee received the B.S. degree in electronic engineering from Seoul National University, Korea, in 1970, and the Ph.D. in electrical engineering from Oregon State University, Corvallis, in 1982, where he specialized in electromagnetic fields and microwave engineering. Since 1983, he has been with the Radio Technology Department of ETRI, and he is currently a professor at Korea Advanced Institute of Science and Technology (KAIST).