High Design-Level Comparison of MIMO Baseband Hardware Architectures

Slide 1

High Design-Level Comparison of MIMO Baseband Hardware Architectures

Steffen Paul
Infineon Technologies
Munich, Germany

Markus Rupp
TU Vienna, INTHFT
Vienna, Austria

Slide 2

Contents

Current situation in designing modem hardware and future architectures
Modem parameters for MIMO HSDPA
Modem architecture covering Release 4 to Release 6
Modem subblocks and their implementation issues and effort
Conclusions

Slide 3

Algorithms and DSPs

Computational complexity of wireless systems grows faster than processor performance
The gap gets larger and larger

Source: J. Rabaey (UCB) and R. Subramanian (Morphics)

Slide 4

Impact on System Architecture

Purely DSP-based systems (as e.g. in GSM) can only be realized some time after the introduction of a wireless standard
Design of the receiver in dedicated hardware takes too much effort
Such solutions will hit the market early enough

Source: T. Noll, RWTH Aachen

Slide 5

Expected MIMO Modem Architectures

MIMO modems will be built from a mix of
  dedicated HW blocks (HW accelerators) performing regular operations at high data rate
  dedicated simple application-specific processors for specific signal processing tasks, e.g. pilot signal processing

These processors interact only locally with HW blocks (e.g. HW 5, HW 6, with one being a processor), with the restriction not to occupy the bus.

[Figure: block diagram. Labels: DSP, dDSP, HW 3, HW 4, Bus, HW 5, dDSP]

Slide 6

Standard Evolution and Modem Development

Advantages of such an architecture
  Early start of product development
  Optimization of computationally demanding blocks
  Late changes in block interaction and fine-tuning of algorithms

Evolution of a wireless standard
  Basic parameters specified (e.g. CDMA system)
  Step-by-step enhancements of features, definition of performance requirements, configurations etc.
  Settling of performance requirements

Product development
  Architecture and datapath definition
  Implementation of algorithms
  Simple fixes: SW update (e.g. task scheduling, parameter estimation)

Slide 7

A Future MIMO System

MIMO is part of UMTS in Rel. 6 together with HSDPA

Given parameters (based on 3GPP TR 25.876)
  Number of antennas: up to four on Tx and Rx side
  Sample rate: 7.68 Msamples/s with oversampling factor 2
  Modulation: QPSK, 16 QAM
  Spreading codes: up to 15 in parallel with fixed spreading factor 16
  Channel delay spread: up to 3.7 μs (about 30 half chips)
  Number of propagation paths between any two antennas: not specified yet

Silicon parameters
  Clock frequency of chip: 200 - 300 MHz
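As a quick cross-check of the parameters above, the following sketch converts the delay spread into half-chip samples. It assumes the standard UMTS chip rate of 3.84 Mchip/s, which is not stated on the slide but is consistent with 7.68 Msamples/s at oversampling factor 2. All variable names are illustrative.

```python
# Assumption: UMTS chip rate of 3.84 Mchip/s (implied by 7.68 Msamples/s at OSR 2).
CHIP_RATE_HZ = 3.84e6
OVERSAMPLING = 2
DELAY_SPREAD_S = 3.7e-6  # maximum channel delay spread from the slide

sample_rate_hz = CHIP_RATE_HZ * OVERSAMPLING      # half-chip sample rate
half_chip_s = 1.0 / sample_rate_hz                # duration of one half chip
delay_spread_half_chips = DELAY_SPREAD_S / half_chip_s

print(f"sample rate:  {sample_rate_hz / 1e6:.2f} Msamples/s")
print(f"delay spread: {delay_spread_half_chips:.1f} half chips")
```

The result lies a little above 28 half chips, which the slide rounds to about 30.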

Slide 8

Some Hardware Aspects beyond OP Count

Degrees of freedom
  Mapping of operations onto DSP or dedicated HW
  Degree of parallelism, loop unrolling
  Time multiplexing (resource sharing)

Complexity metric
  Number of arithmetic operations
  Reuse of blocks / parallelism, loop unrolling
  Memory amount and number of read/write operations
  Amount of bus communication
  Power consumption, chip area

Slide 9

Transmitter Structure

One proposal for UMTS Release 6 MIMO extension (PARC MIMO by Lucent)

Transmitter:

[Figure: transmitter block diagram. Labels: high-speed data stream, DEMUX, coding / interleaving / mapping (one chain per branch), spreading codes 1, 2, ..., C, scrambling code, antenna 1 ... antenna T]

3GPP RAN WG1, R1-010941

Slide 10

Receiver Algorithms

Matrix algebra based (zero forcing, MMSE equalization, BLAST techniques)
  Pro:  good performance
  Cons: implementation complexity
        numerical stability of fixed-point implementations

Correlator based (RAKE based receiver)
  Pro:  proven technology in Rel. 4
        same HW for Rel. 4 and HSDPA MIMO extension
        flexible design
        lots of options for HW mapping
  Cons: limited performance

Slide 11

Proposed Iterative Receiver for PARC

As suggested in 3GPP RAN WG1, R1-010941

[Figure: receiver block diagram. Labels: Mux; MMSE detection for remaining antenna with highest SINR; Despread 1 ... Despread 10; Collect and mux; Detect, demap, deinterleave, decode; Reconstruct signals for cancellation]

Slide 12

Low Complexity Receiver

Reduced complexity receiver based on RAKE (without MRC) and ML detection
Complexity proportional to the number of RAKE fingers even under ML!

[Figure: receiver chain. Labels: M transmit antennas; N receive antennas; RF to baseband; Channel estimator, Finger; RAKE; virtual antennas; ML detector; Turbo decoder]

Slide 13

Low Complexity Receiver

All receive paths are processed as individual fingers, without the MRC used in a conventional RAKE

[Figure: receiver chain. Labels: N receive antennas; RF to baseband; Channel estimator, Finger; four buffers; RAKE with Ant. 1 to Ant. 4, Finger 1 and Finger 2 each; ML detector]

[Figure: RAKE finger detail. Labels: Descrambling with scrambling sequence generator; Despreading with PN sequence generator; Integrate & Dump; Code tracking (channel impulse response)]

Slide 14

RAKE HW Complexity

Per RAKE finger
  Optimal sampling point reconstruction
  Descrambling: 1-bit multiplication
  Despreading with fixed spreading factor of 16: 16 complex adds

HW clock much higher than sample rate
  Resource reuse possible
  Parallel fingers: inefficient HW use
  Logical fingers mapped onto one physical finger

[Figure: two timing diagrams showing Finger 1, Finger 2, Finger 3, ..., Finger M over one symbol time]
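The per-finger operations listed above (descrambling, despreading, integrate and dump) can be sketched as follows. This is a minimal illustration, not the hardware described on the slides: it assumes an ideal single-path channel with perfect timing, chip-rate samples, scrambling chips of the form ±1 ± j, and toy code sequences. The function names `spread` and `rake_finger` are hypothetical.

```python
def spread(symbols, ovsf, scramble):
    """Transmit side: spread each symbol with the OVSF code, then scramble."""
    chips = [s * c for s in symbols for c in ovsf]
    return [x * scramble[i % len(scramble)] for i, x in enumerate(chips)]


def rake_finger(chips, ovsf, scramble):
    """One finger: descramble, despread, integrate and dump once per symbol."""
    sf = len(ovsf)
    symbols = []
    for start in range(0, len(chips) - sf + 1, sf):
        acc = 0j
        for k in range(sf):
            i = start + k
            # Descrambling: multiply by the conjugate scrambling chip.
            descrambled = chips[i] * scramble[i % len(scramble)].conjugate()
            # Despreading: the OVSF chip is +1 or -1, so this is an add or subtract.
            acc += descrambled * ovsf[k]
        # |scrambling chip|^2 = 2 for chips of the form +-1 +-1j.
        symbols.append(acc / (2 * sf))
    return symbols


OVSF_A = [1, -1] * 8   # two orthogonal length-16 codes (SF = 16)
OVSF_B = [1] * 16
SCRAMBLE = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 - 1j]  # toy pattern, repeated

tx = [1 + 1j, -1 - 1j, 1 - 1j]
rx = spread(tx, OVSF_A, SCRAMBLE)
print(rake_finger(rx, OVSF_A, SCRAMBLE))  # matching code: transmitted symbols
print(rake_finger(rx, OVSF_B, SCRAMBLE))  # orthogonal code: zeros
```

With SF = 16 the inner loop performs 16 complex accumulations per symbol, matching the adder count quoted on the slide.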

Slide 15

RAKE HW Complexity

Resource reuse possibilities:

Parameters: spreading factor SF, HW clock frequency f_clock, slot duration T_slot, number of codes C

Gives reuse factor (C = 1):

$$R = f_{clock} \cdot \frac{T_{slot}}{\text{Symbols/Slot}} \cdot \frac{1}{2 \cdot SF}$$

f_clock = 200 MHz: R = 26
f_clock = 300 MHz: R = 39

i.e. up to R = 26 logical fingers can be mapped onto one physical finger at 200 MHz

Performance requirements suggest the combination of up to six paths
In total at most 4 x 6 x C logical fingers are required if one finger processes one code (C = 1)
However, C = 15 is also possible -> prohibitively many fingers
Finger with multicode capability recommended (e.g. 3 codes in parallel); then 4 x 6 x C/3 = 8C logical fingers are required (120 for C = 15)
With moderate clock frequency only 1-3 physical fingers needed
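The reuse factor can be reproduced numerically. The sketch below assumes standard UMTS framing (10 ms frame, 15 slots, 2560 chips per slot), which the slide does not state explicitly, and reads the factor 2 as the oversampling factor. The function name `reuse_factor` is illustrative.

```python
import math

# Assumptions (standard UMTS framing, not stated on the slide):
# 10 ms frame with 15 slots, 2560 chips per slot.
T_SLOT_S = 10e-3 / 15
CHIPS_PER_SLOT = 2560


def reuse_factor(f_clock_hz, sf=16, oversampling=2):
    """Clock cycles per symbol divided by samples per symbol, rounded down."""
    symbols_per_slot = CHIPS_PER_SLOT // sf
    cycles_per_symbol = f_clock_hz * T_SLOT_S / symbols_per_slot
    return math.floor(cycles_per_symbol / (oversampling * sf))


for f_mhz in (200, 300):
    print(f_mhz, "MHz ->", reuse_factor(f_mhz * 1e6))
```

Equivalently, R is the number of clock cycles available per half-chip sample, f_clock / 7.68 MHz, rounded down.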

Slide 16

RAKE HW Complexity

Number of logical multicode fingers for four antennas and QPSK only:

            # propagation paths per antenna
# codes     2       4       6       Data rate (MBit/s)
1           8       16      24      1.9
6           16      32      48      11.5
12          32      64      96      23
15          40      80      120     28.8

Number of physical multicode fingers @ 300 MHz:

            # propagation paths per antenna
# codes     2       4       6
1           1       1       1
6           1       2       2
12          1       2       3
15          2       3       4

Physical implementation of only a few fingers is required
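The logical-finger counts and data rates in the first table follow from simple arithmetic, sketched below. The assumptions are mine: fingers process 3 codes in parallel (as suggested on the previous slide), and the data rate is the raw QPSK channel bit rate over four transmit antennas at the UMTS chip rate of 3.84 Mchip/s. Function names are illustrative.

```python
from math import ceil


def logical_fingers(codes, paths, rx_antennas=4, codes_per_finger=3):
    """Logical multicode fingers: one per antenna, path and group of codes."""
    return rx_antennas * paths * ceil(codes / codes_per_finger)


def raw_data_rate_mbit(codes, tx_antennas=4, bits_per_symbol=2,
                       sf=16, chip_rate_hz=3.84e6):
    """Raw channel bit rate (before decoding), assuming QPSK on every code."""
    symbol_rate_hz = chip_rate_hz / sf
    return codes * tx_antennas * bits_per_symbol * symbol_rate_hz / 1e6


for codes in (1, 6, 12, 15):
    row = [logical_fingers(codes, paths) for paths in (2, 4, 6)]
    print(codes, row, round(raw_data_rate_mbit(codes), 2))
```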

Slide 17

Rake HW Effort

Multicode fingers
  Transmit signal is the sum of bit streams with different spreading codes
  All components arrive at the receiver after passing the same channel
  Sample point reconstruction and descrambling are the same for all signal components (codes) of one propagation path

[Figure: single-code RAKE finger. Labels: Descrambling with scrambling sequence generator; Despreading; Code tracking (channel impulse response)]

[Figure: multicode RAKE finger. Labels: Descrambling with scrambling sequence generator; Despreading with PN sequences 1, 2, 3; one Integrate & Dump stage per branch; (channel impulse response)]

Slide 18

Data Buffering

Rather than forming subgroups of fingers assigned to individual antennas, all fingers are treated equally -> flexibility in assigning fingers to antennas
Sample streams of the receive antennas write into the same buffer (pre-buffering a few samples)
Data rate: write 4 x 7.68 MHz = 30.72 MHz, read M x 7.68 MHz
Write 4 and read M samples
@ 300 MHz, M = 35 could be supported
Split into two buffers doubles M

[Figure: receiver chain (RF to baseband; Channel estimator, Finger; Buffer; RAKE with Ant. 1 to Ant. 4, Finger 1 and Finger 2 each; ML detector) and buffer timing: samples from RRC for Ant 1 to Ant 4 are written every Tc/2 and read by fingers 1 ... M]
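The figure M = 35 follows from a memory access budget, sketched below under the assumption of a single-port buffer that serves one access per clock cycle (my reading, not stated on the slide). The function name is illustrative.

```python
def max_finger_reads(f_clock_hz, n_rx_antennas=4, sample_rate_hz=7.68e6):
    """Finger reads per sample period left after the antenna writes."""
    # Assumption: the buffer serves exactly one access per clock cycle.
    accesses_per_sample_period = int(f_clock_hz // sample_rate_hz)
    return accesses_per_sample_period - n_rx_antennas


print(max_finger_reads(300e6))
```

At 300 MHz there are 39 accesses per half-chip period; 4 are used for writing the antenna samples, which leaves 35 for finger reads.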

Slide 19

Sample Buffering: 2 Solutions

Solution one
  Common buffer for all fingers
  Fingers run synchronously
  Simplifies operation of ML detector
  Synchronous code generators
  Buffer size (I, Q in separate buffers):
    #Antenna x OSR x (SHO + DS)
    = 4 x 2 x (296 + 120)
    = 3328 samples @ 8 Bit
    = 26 kBit

[Figure: ring buffer memory feeding Finger 1, Finger 2 and Finger 3, with code generators; memory address selected by finger placement]
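The buffer size of solution one can be checked with a few lines. The sketch simply evaluates the formula from the slide with its own numbers and takes 1 kBit as 1024 bit, which is an assumption that reproduces the quoted 26 kBit. The function name is illustrative.

```python
def common_buffer_size(n_antennas=4, osr=2, sho=296, ds=120, bits_per_sample=8):
    """Return (samples, bits) for one of the I/Q buffers of solution one."""
    samples = n_antennas * osr * (sho + ds)
    return samples, samples * bits_per_sample


samples, bits = common_buffer_size()
print(samples, "samples,", bits / 1024, "kBit")
```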

Slide 20

Sample Buffering: 2 Solutions

Solution two
  Symbol buffer at each finger
  Code generator phase needs to be controlled
  Total buffer size (I, Q in separate buffers):
    #Finger x (SHO + DS)/SF
    = 120 x (296 + 120)
    = 49920 symbols @ 8 Bit
    = 390 kBit

    #Finger x (SHO + DS)/SF
    = 60 x (296 + 120)
    = 24960 symbols @ 8 Bit
    = 195 kBit
  Much more memory due to large number of fingers
  Lower RW rate than solution one

[Figure: channel path profile; start of code generators (CG 1, CG 2, CG 3); Finger 1, Finger 2 and Finger 3, each with a symbol buffer; output to MIMO detector]

Slide 21

Basic Channel Estimation Structure

TX and RX structure (TR 25.869)

[Figure: TX and RX structure for channel estimation. Labels: Pilot Symbol Pattern #1 (P1): AA; Pilot Symbol Pattern #2 (P2): A-A or -AA; OVSF codes C_OVSF1 and C_OVSF2; scrambling code C_SC; gain g and 1/g; antenna #1 to antenna #4; channel coefficients h1 to h4; X1 to X4; outputs ha, hb and hA to hD]

Slide 22

Channel Estimation HW Effort

Per antenna
  1 descrambling
  2 despreadings (code length 256 chips)
  4 correlations with pilot pattern (AA, A-A)
  4 smoothings

Split into two different tasks
  Delay profile estimation as input information for finger placement
    descrambling over pilot sequence and pilot modulation
  Channel weight estimation
    Use of multicode RAKE finger for channel weight estimation (depending on multicode capability up to 2) and postprocessing of output (symbol modulation AA, A-A etc.)
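The postprocessing of the pilot patterns can be illustrated with a small sketch. It shows one common way to separate two channel coefficients when two antennas send the patterns AA and A-A on the same code; the slides do not spell out this arithmetic, so the approach is an assumption. The sketch further assumes a channel that is constant over the two pilot symbols and ignores noise and smoothing. The function name is hypothetical.

```python
def separate_channels(y1, y2, pilot):
    """Correlate two despread pilot symbols with the patterns AA and A-A.

    Assumed signal model (noise-free):
        y1 = (h_p1 + h_p2) * A
        y2 = (h_p1 - h_p2) * A
    """
    h_p1 = (y1 + y2) / (2 * pilot)  # correlation with pattern AA
    h_p2 = (y1 - y2) / (2 * pilot)  # correlation with pattern A-A
    return h_p1, h_p2


A = 1 + 1j                      # pilot symbol
h1, h2 = 0.5 - 0.25j, -1 + 2j   # example channel coefficients
y1 = (h1 + h2) * A
y2 = (h1 - h2) * A
print(separate_channels(y1, y2, A))
```

With two despread code outputs per receive antenna, this sum and difference step accounts for the four correlations per antenna listed above.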

Slide 23

ML Detector

In principle
  four antennas and QPSK: 256 possibilities
  four antennas and 16 QAM: 65536 possibilities

But reduction is possible by operating on a reduced point set first and picking the n best candidates, on which the ML search is then done
Typically, n = 1...20

$$\hat{d} = \arg\min_{d_j} \left\| r - H d_j \right\|_2^2$$

Rake finger output $r \in \mathbb{C}^{N L_r}$, with N the number of antennas and L_r the number of resolvable paths

Reduction of the l2 norm to

$$|d| \approx \frac{5}{8} \max\left(|\mathrm{Re}(d)|, |\mathrm{Im}(d)|\right) + \frac{3}{8} \left(|\mathrm{Re}(d)| + |\mathrm{Im}(d)|\right)$$
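A brute-force version of the ML search can be sketched as follows. This is a minimal illustration with a toy 4 x 4 channel matrix and noise-free reception, not the reduced-complexity detector of the slides. It assumes the magnitude approximation is applied per component of the residual, with 5/8 weighting the maximum and 3/8 the sum of the real and imaginary magnitudes; that weighting is my reading of the garbled formula. All names are illustrative.

```python
from itertools import product

QPSK = [complex(a, b) for a in (1, -1) for b in (1, -1)]


def approx_abs(z):
    """Approximate |z| without squares or square roots."""
    re, im = abs(z.real), abs(z.imag)
    return 0.625 * max(re, im) + 0.375 * (re + im)


def apply_channel(H, d):
    """Matrix-vector product H d with H given as a list of rows."""
    return [sum(h * dj for h, dj in zip(row, d)) for row in H]


def ml_detect(r, H, alphabet=QPSK, metric="l2"):
    """Exhaustive search over all transmit vectors (256 for 4 x QPSK)."""
    best, best_cost = None, float("inf")
    for d in product(alphabet, repeat=len(H[0])):
        residual = [ri - yi for ri, yi in zip(r, apply_channel(H, d))]
        if metric == "l2":
            cost = sum(abs(e) ** 2 for e in residual)
        else:  # approximated magnitudes, summed over the components
            cost = sum(approx_abs(e) for e in residual)
        if cost < best_cost:
            best, best_cost = d, cost
    return best


# Toy channel (diagonally dominant, hence invertible) and noise-free reception.
H = [[3, 1j, 0, 1], [1, 3, 1j, 0], [0, 1, 3, 1j], [1j, 0, 1, 3]]
d_true = (1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j)
r = apply_channel(H, d_true)
print(ml_detect(r, H), ml_detect(r, H, metric="approx"))
```

For 16 QAM the same loop would visit 65536 candidates, which is why the slide proposes a search over a reduced candidate set.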

Slide 24

Conclusions

A Rake receiver based implementation for the specific requirements of MIMO HSDPA in UMTS Release 6 is possible with moderate hardware effort
Number of fingers grows rapidly
Special handling of parallel use of up to 15 codes requires the use of fingers with multicode capability
Hardware structure of Release 4 receiver is a subset of the RAKE MIMO architecture
More advanced concepts, e.g. interference cancellation, can be added to the Rake receiver

Slide 25

References

M. Rupp, G. Gritsch, H. Weinrichter: Approximate ML detection for MIMO systems with very low complexity. Proc. ICASSP 2004, Montreal.

D. Samardzija, P. Wolniansky, J. Ling: Performance evaluation of VBLAST algorithm in W-CDMA systems. Proc. Vehicular Technology Conf. Fall, 2001.

R. Van Nee, A. Van Zelst, G. Awater: Maximum likelihood decoding in space division multiplexing system. Proc. Vehicular Technology Conf. Spring, 2000.

A. Adjoudani, E. Beck et al.: Prototype experience for MIMO BLAST over third generation wireless system. IEEE Journal on Selected Areas in Communications, Vol. 21, No. 3, 2003.