TRANSCRIPT
© 2000 Altera Corporation
3rd Generation Wireless Technical Solutions Seminar
Altera, ACCESS, ACEX, ACEX 1K, ACEX 2K, AMPP, APEX, APEX 20K, APEX 20KE, BitBlaster, ByteBlaster, ByteBlasterMV, Classic, ClockBoost, ClockLock, ClockShift, CoreSyn, E+MAX, EPC2, FastTrack, FineLine BGA, FLEX, FLEX 10K, FLEX 10KE, FLEX 10KA, FLEX 8000, FLEX 6000, FLEX 6000A, Jam, MasterBlaster, MAX 9000, MAX 9000A, MAX 7000, MAX 7000E, MAX 7000S, MAX 7000A, MAX 7000AE, MAX 7000B, MAX 3000, MAX 3000A, MAX, MAX+PLUS, MAX+PLUS II, MegaCore, MegaLAB, MegaWizard, MultiCore, MultiVolt, NativeLink, nSTEP, OpenCore, OptiFLEX, Quartus, SignalTap, and specific device designations are trademarks and/or service marks of Altera Corporation in the United States and other countries. Altera acknowledges the trademarks of other organizations for their respective products or services mentioned in this document, specifically: Adobe and Acrobat are registered trademarks of Adobe Systems Incorporated. BP Microsystems is a registered trademark of BP Microsystems. CTI PET Systems, Inc. is a trademark for CTI, Inc. Data I/O and UniSite are registered trademarks of Data I/O Corporation. HP-UX is a trademark of Hewlett-Packard Company. Mentor Graphics is a registered trademark and LeonardoSpectrum and ModelSim are trademarks of Mentor Graphics. Microsoft, Windows, Windows 98, and Windows NT are registered trademarks of Microsoft Corporation. Rochester Electronics is a registered trademark of Rochester Electronics, Inc. Sun is a registered trademark and Solaris is a trademark of Sun Microsystems, Inc. Synopsys is a registered trademark and FPGA Express is a trademark of Synopsys, Inc. System General is a registered trademark of System General. Altera products are protected under numerous U.S. and foreign patents and pending applications, maskwork rights, and copyrights. 
Altera warrants performance of its semiconductor products to current specifications in accordance with Altera’s standard warranty, but reserves the right to make changes to any products and services at any time without notice. Altera assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Altera Corporation. Altera customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services. The actual availability of Altera’s products and features could differ from those projected in this publication and are provided solely as an estimate to the reader.
Copyright © 2000 Altera Corporation. All rights reserved.
Agenda
Altera Business and Technology Overview
W-CDMA System Implementation with Altera
Short Break
Forward Error Correction Solutions
Fundamental DSP Building Blocks
Altera Product Update
Altera Business and Technology Overview
The Programmable Solutions Company™
High-Density CMOS Programmable Logic Devices
Intellectual Property
Development Software
Highlights
Founded in 1983
$837 Million in 1999 Sales
$272.8 Million in Q1 2000
1,500+ Employees
14,000+ Customers Worldwide
Worldwide Research & Development
European Technical Center – High Wycombe, U.K.
– IC, Software, and IP Design
– Focus on Telecommunications
Asian Technical Center – Penang, Malaysia
– IC Design and Test Engineering
– 62,000 Sq. Foot Facility Supports up to 350 Employees
(Figure: Altera Asian Technical Center, Penang, Malaysia)
Worldwide Manufacturing Capacity
World-Class Wafer Foundries
– Sharp, TSMC, WaferTech
– Continuity of Supply
State-of-the-Art Development Partnership with TSMC
– 0.42-µ, 0.3-µ, 0.22-µ and 0.18-µ Processes Released to Production
– 0.15-µ and 0.13-µ Processes in Development
– Debug and Qualify Process at TSMC, Transfer to WaferTech
WaferTech Joint Venture Fab
– Located in Camas, Washington
– 0.42-µ, 0.3-µ and 0.22-µ Processes Released to Production
Revenue by Market Segment
1999 Total: $836.6M
Communications: 66%
EDP: 16%
Industrial: 11%
Other: 4%
Consumer: 3%
Altera Communications Solutions
Telecom
Networking
Mobile Communications
Broadcast & Studio
All Areas of Communications
Component Overview
(Chart: device families positioned by usable gates and I/O: APEX 20K, FLEX 10K, FLEX 8000, FLEX 6000, ACEX 1K, ACEX 2K, MAX 9000, MAX 7000, MAX 3000)
APEX: Multi-Million-Gate Device
Up to 1.5-Million Usable Gates
2.5-V and 1.8-V Family
125-MHz System Performance
Up to 442 Kb of RAM
Content Addressable Memory (CAM)
High-Performance PLL
High-Performance I/O
ACEX: High Performance at Low Cost
ACEX 1K
– 0.22-/0.18-micron Hybrid Process
– 2.5-V Core and 5-V Tolerant I/Os
– Up to 100K Usable Gates
– 49 Kb of Dual-Port RAM
– 64-Bit, 66-MHz PCI Compliant
– PLL Support
ACEX 2K
– 0.18-micron 6LM Process
– 1.8-V Core
– Up to 150K Usable Gates
– Dual-Port RAM Blocks
– High-Performance PLL
– Advanced I/O Standard
Price Trend
Price per Logic Element: 40% Lower per Year
(Chart: LUT-based sales out per LE, normalized to Q1 1993, plotted for Q1 and Q3 of each year from 1993 to 1999 plus Q1 2000. Data labels, highest to lowest: 1, 0.901, 0.578, 0.354, 0.261, 0.17, 0.144, 0.132, 0.086, 0.069, 0.055, 0.046, 0.042, 0.037, 0.031)
Intellectual Property
120+ Cores
Optimized for Altera Devices
Fully Tested
Easy to Customize
Development Board
Free Evaluation using OpenCore™ Program
Communications Focus
30 Development Partners
Quartus™ Development Software
Multi-Million-Gate Design
Workgroup-Based Computing
Integration with Third-Party Tools
Seamless Integration of IP Cores
SignalTap™ Logic Analysis
Internet Aware
Synopsys FPGA Express™ Software
Mentor Graphics LeonardoSpectrum™ Software
Mentor Graphics ModelSim™ Software
System-on-a-Programmable-Chip Solution
Process Geometry Migration
(Chart: drawn gate width in microns versus year, 1998 to 2004, with data labels 0.22, 0.18, 0.15, 0.13, 0.12, and 0.10, and density annotations of 10 MT/cm2 and 100 MT/cm2)
Technology Migration
(Chart: delay as technology migrates, comparing gate delay with interconnect delay)
Technology Migration: Smaller/Closer Interconnect
Aluminum-Copper Interconnect
(Figure labels: Aluminum, Copper)
40% Less Resistance
Smaller Wires Carry Same Current
More Interconnect in Same Space
Programmable Logic Capacity
(Chart: usable gates in thousands, log scale from 1 to 10,000, versus year, 1985 to 2005, for Classic, MAX® 5000, MAX 7000, FLEX® 8000, FLEX 10K, and APEX™ 20K)
ASIC vs. PLD Density
(Chart: density versus time for Available ASIC Gates, Used ASIC Gates, and Available PLD Gates)
Available ASIC Gates Far Exceed Average Gate Utilization
Today, PLD Density Addresses Mainstream Designs
Minimum Order Quantities
As Technology and Wafer Sizes Change, We Get More Net Die Per Wafer
Typical Net Die Per Wafer:
– 6-inch Wafer, 0.6 µ: 1x
– 8-inch Wafer, 0.25 µ: 30x
– 12-inch Wafer, 0.15 µ: 200x
This Leads to an Issue of Minimum Order Quantities...
CMOS Digital Logic Market
1999 Total: $24.2B
Standard Cell: 41%
Other Logic: 23%
Gate Array: 13%
PLD: 10%
Standard Logic: 9%
Custom IC: 4%
Standard Cell does not include Mixed-Signal ASICs.
Source: Dataquest
The Future of System Design
System on a Chip (Cell-Based IC)
– Volume Driven
– Lowest Unit Cost
– Small Form Factor
– Consumer-Oriented Products
System on a Programmable Chip (PLD)
– Time-to-Market Driven
– Flexibility
– Infrastructure Products
Altera PLD vs. DSP
(Chart: flexibility versus performance)
DSP Processor: Flexible, But Lacks Real-Time Performance
ASSPs and ASICs: High Performance, but Inflexible
Altera DSP: Flexibility of DSP, with Performance of ASIC
Implementing FIR Filter with Altera
(Chart: MIPS/$, scale 20 to 100, for a typical DSP processor, the Altera serial FIR implementation, and the Altera parallel FIR implementation; annotated "Ten Times More Cost Effective")
This benchmark is for a symmetric FIR filter (50 taps, 8-bit data, 12-bit coefficient resolution) targeting an Altera EP20K100 APEX™ device and a TI TMS320C54x (50 MHz) DSP processor.
10x high-performance cost advantage over standard DSP processors
Shorten complex, high-speed FIR filter design time from six weeks to just one day
FLEX / APEX - LUT Multiplication Benefits
LUT Operation: “Any Function of 4 Inputs Can Be Done”
Pre-computed Values Loaded in LUT
No Need for Full Multiplier
(Diagram: logic element with inputs DATA1 to DATA4 feeding the look-up table (LUT); carry chain with Carry In and Carry Out; cascade chain with Cascade In and Cascade Out; register with PRN, CLRN, D, Q, and CLK driving LE Out. Pre-computed filter coefficients are loaded into the LUT, shown as 1010, 0011, 1100, 1110)
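The idea of replacing a full multiplier with pre-computed values can be sketched in software. The following is a minimal illustration, not Altera's implementation: it assumes an unsigned 8-bit sample and a fixed coefficient, and the names `build_lut` and `lut_multiply` are hypothetical. Each 4-bit slice of the sample addresses a 16-entry table of pre-computed partial products, so only lookups, shifts, and additions remain. In hardware, each output bit of such a table would map to one 4-input LUT.

```python
def build_lut(coefficient):
    """Pre-compute coefficient * n for every 4-bit address n (16 entries)."""
    return [coefficient * n for n in range(16)]


def lut_multiply(lut, sample, width=8):
    """Multiply an unsigned sample by the constant baked into `lut`.

    The sample is split into 4-bit slices; each slice addresses the table and
    the partial products are shifted and added. No multiplier is used here.
    """
    product = 0
    for shift in range(0, width, 4):
        nibble = (sample >> shift) & 0xF
        product += lut[nibble] << shift
    return product


coefficient_lut = build_lut(37)           # 37 is an arbitrary example coefficient
result = lut_multiply(coefficient_lut, 200)
```

Here `result` equals `37 * 200`. The same table can be reused for every sample, which is why a constant-coefficient filter tap needs no general-purpose multiplier.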
Summary
W-CDMA System Implementation with Altera
Global Wireline/Wireless Market Trends 1995-2010
(Chart: subscribers in millions, 0 to 1,600, versus year, 1996 to 2010, for Global Wireline, Global Wireless, and Global Wireless (Revised))
Wireless/Wireless Data Growth
Global Market Trends
(Chart: subscribers in millions, 200 to 1,600, versus year, 1995 to 2010, for Total Wireless and Wireless Data)
Majority of growth will be in Wireless Data (image, video and online service)
Online services will be the key driver of wireless data growth
As the internet evolves, new data applications will surface
Roadmap to Third Generation – “Mobile Internet”
(Chart: performance versus time)
~1990: Analog
– AMPS (N.A.)
– TACS (Europe)
– NTT TACS (JPN)
Mid-90s: Digital (Improvement in Frequency Utilization)
– N-CDMA Technology: IS-95 (N.A., Korea, Hong Kong, Japan)
– TDMA Technology: IS-54/136 (N.A.), GSM (Europe), PDC (Japan)
2000: Multimedia (Improvement in Frequency Utilization & Support for Multimedia Communication)
– W-CDMA Technology: Wide-band CDMA, CDMA 2000, UWC-136
Mobile Multimedia: IMT-2000
– Up to 2 Mbps Data
– Global Roaming
– High-Quality Multimedia (Internet Access, Video Conference)
International Mobile Telecommunications for the Year 2000 (IMT-2000)
ITU - FPLMTS (IMT-2000)
IMT-2000
International Roaming
Mobile & Fixed Services
Security & Privacy
Global Personal Mobility
(Diagram labels: Land Mobile, Satellite, Public/Private Landline)
• Multiple Services
• Increasing Data Rate
• Unification of Multiple Standards
Programmable Solution - The Right Solution for 3G
Demand for 3rd generation service is not known
– Flexible hardware needed to address market dynamics
Multiple versions of 2.5G and 3G are expected in different regions
– Common platform for economy of scale
Competitive pressures and higher capacity requirements will push the implementation of smart antennas, multi-user detection (MUD), and space-time adaptive processing (STAP) algorithms
– Calls for flexible platform
– Generic DSPs do not have sufficient processing power
3G W-CDMA Overview
Acronym: ITU calls it IMT-2000; ETSI calls it UMTS
Main Drivers:
– Higher data rate services and better spectrum efficiency
– Full coverage and mobility for 384 Kbps
– Limited coverage and mobility for 2 Mbps
– High spectrum efficiency compared to existing systems
– High flexibility to introduce new services
3G W-CDMA Overview (cont.)
DS-CDMA selected as air interface
Two modes of operation defined:
– FDD - for paired band
– TDD - for unpaired band
For FDD bands, two standards defined:
– W-CDMA
– cdma2000
For TDD bands, one standard defined: TDMA/CDMA
(Diagram: at the base station, FDD uses separate down link and up link bands with a duplex separation; TDD alternates DL and UL slots in time)
Air Interface Details
Bandwidth: 5 MHz, 10 MHz, 20 MHz
Chip rate: 3.84 Mcps
Frame size: 10 ms
Modulation: QPSK
Spreading factor:
– FDD Uplink: 256 to 4
– FDD Downlink: 512 to 4
– TDD Uplink & Downlink: 16 to 1
FDD:
– Downlink: 2110 - 2170 MHz
– Uplink: 1920 - 1980 MHz
TDD:
– 1900 - 1920 MHz
– 2010 - 2025 MHz
W-CDMA Signal Generation (Uplink/Downlink)
(Block diagram of the signal chain, with the function of each stage:)
Data (0101)
Add CRC Bits: Error Detection
Add FEC Bits: Error Correction
Interleaver
Complex Spreading (DL) / HPSK Spreading (UL), driven by the OVSF Code Generator (User ID or Channel ID): Orthogonal Spreading
Complex Scrambling with “Gold Code” (BS ID or UE ID): Scrambling
FIR Filters: Spectral Containment
I/Q Mod.: Modulation & Upconversion
RF Out
W-CDMA Transmitter Architecture
(Block diagram. Blocks shown: Data (0101); CRC; FEC, using the Convolutional Encoder Block or the Turbo Encoder; Interleaver; Complex Spreading, fed by the OVSF Code Generator and the Scrambling (PN) Code Generator (“Gold Code”); I/Q Mapper; RRC Filters (Base Band Transmit Filter); Interpolation; QPSK Modulator with NCO, cos(2πft); DAC. Connectors A and B link the two halves of the drawing.)
W-CDMA Receiver Architecture
(Block diagram. Blocks shown: BPF; ADC; I/Q Demod with cos(2πft); De-spreading; Multipath Estimator (delays, phases); Multipath Combiner; Multi-user detector; channel estimation and symbol decision; De-Interleaver; Viterbi/Turbo Decoder; CRC with Error Indication; Data (01100101). Connector A links the two halves of the drawing.)
Altera Solution for Key 3G Blocks
CRC
FEC cores
– Convolutional Encoder
– Turbo Encoder
– Viterbi Decoder
– Turbo Decoder (new)
PN Generator
Modulator
– RRC filter
– Interpolator
– NCO (new)
Binary Correlator
• Sequential Correlator
• Parallel Correlator
Demodulator
Cyclic Redundancy Check
3G Specification
gCRC24(D) = D^24 + D^23 + D^6 + D^5 + D + 1
gCRC16(D) = D^16 + D^12 + D^5 + 1
gCRC12(D) = D^12 + D^11 + D^3 + D^2 + D + 1
gCRC8(D) = D^8 + D^7 + D^4 + D^3 + D + 1
Solution
Use Altera’s CRC Megafunction: fully parameterized, including:
– Any length generator polynomial
– Input data width, from 1 bit to the width of the polynomial
– Any initial value
Meets 3G Requirements!
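To make the generator polynomials concrete, here is a minimal bit-serial sketch of the division they define. It is an illustration only, not the CRC Megafunction: it assumes one input bit per step, MSB-first processing, and an all-zero initial value, and the function name `crc_parity` is hypothetical. How the parity bits are ordered when attached to a transport block is outside the scope of this sketch.

```python
def crc_parity(bits, exponents, degree):
    """Return the `degree` parity bits (MSB first) for a list of 0/1 bits.

    `exponents` lists the powers of D below D^degree that appear in the
    generator polynomial, e.g. (12, 5, 0) for D^16 + D^12 + D^5 + 1.
    """
    poly = sum(1 << e for e in exponents)   # generator without its leading term
    mask = (1 << degree) - 1
    reg = 0                                 # assumed initial value: all zeros
    for b in bits:
        feedback = ((reg >> (degree - 1)) & 1) ^ (b & 1)
        reg = (reg << 1) & mask
        if feedback:
            reg ^= poly
    return [(reg >> i) & 1 for i in range(degree - 1, -1, -1)]


GCRC16 = (12, 5, 0)                         # D^16 + D^12 + D^5 + 1
data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
parity = crc_parity(data, GCRC16, 16)
residue = crc_parity(data + parity, GCRC16, 16)
```

Running the data followed by its own parity bits through the same divider leaves an all-zero `residue`, which is the check a receiver would perform. Swapping in `(23, 6, 5, 1, 0)` with `degree=24` gives the gCRC24 case.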
3G Specs: Convolutional Encoder
3G Specification
Base-station: K=9, rate = 1/2 and r=1/3
Mobile: K=9 and r=1/3
Convolutional Encoder Representation
(Diagram: rate 1/2 convolutional encoder; input data bits feed a shift register whose taps form the outputs G0 = 561 (octal) and G1 = 753 (octal))
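The rate 1/2, K=9 encoder with G0 = 561 and G1 = 753 (octal) can be sketched as follows. This is a minimal model, not the schematic from the seminar. It assumes the most significant bit of each generator applies to the newest input bit and that the register starts at zero; the function names are hypothetical.

```python
K = 9               # constraint length
G0 = 0o561          # generator polynomials from the slide, in octal
G1 = 0o753


def parity(x):
    """XOR of all bits of x (what a multi-input XOR gate computes)."""
    return bin(x).count("1") & 1


def conv_encode(bits):
    """Rate 1/2 encoder: returns two output bits per input bit."""
    state = 0                                 # K-1 = 8 shift register stages
    out = []
    for b in bits:
        reg = ((b & 1) << (K - 1)) | state    # newest bit in the MSB position
        out.append(parity(reg & G0))
        out.append(parity(reg & G1))
        state = reg >> 1                      # shift by one stage
    return out


# A single 1 followed by eight 0s reads the generator taps back out.
impulse = conv_encode([1] + [0] * 8)
g0_stream = impulse[0::2]
g1_stream = impulse[1::2]
```

With the single-1 input, `g0_stream` spells out 561 octal bit by bit and `g1_stream` spells out 753 octal, which is a quick way to confirm the tap wiring.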
Convolutional Encoder Solution
Implementation Details
Use optimized basic building blocks from Altera (LPM):
LPM_SHIFTREG
– Set number of shift registers to 8
– Other options to select: a) parallel loading b) LE implementation
LPM_XOR
Results
LE Count: 12 LEs
Performance: 250 MHz
Device: Easily fits into EP20K100E
Meets 3G Requirements!
Convolutional Encoder Schematic
3G Specs: Viterbi Decoder
3G Specification
Base-station: K=9 and r=1/3
Soft decision decoding
Used to give BER 10^-3 for voice
Viterbi Decoder Representation
(Block diagram: Branch Metrics Computation; Add, Compare, Select; Metric Storage Memory; Trace Back Path; Decoder Control. Signals: Symbol, Valid, SysClk, RESET, RR)
Viterbi Decoder Solution
Implementation Details
Use Altera’s Viterbi Megafunction:
1) VITTOPA - external depuncturing or for unpunctured codes
2) VITTOPB - internal depuncturing
Results
Implementation       LEs     Speed
Serial               1300    500 kb/s
Serial / Parallel    2600    2 Mb/s
Meets 3G Requirements!
3G Specs: Turbo Encoder
3G Specification
Parallel Concatenated Convolutional Code (PCCC) with two 8-state constituent encoders and an interleaver
Block Size: 40 - 5114 bits
Puncturing: Rate=1/3 (no puncturing), Rate=1/2 (puncturing)
Turbo Encoder Representation
(Block diagram: input data feeds Encoder 1 directly and Encoder 2 through the Interleaver; the parity outputs pass through Puncture)
Turbo Encoder Solution
Implementation Details
Use Altera’s Turbo Encoder Megafunction
Generic, easy-to-use interface
UMTS compliant interleaver included
Results
Turbo Encoder requires 3000 LEs, ESB = 10 (suitable device EP20K100E)
Meets 3G Requirements!
3G Specs: Turbo Decoder
3G Specification
Decode punctured and unpunctured Turbo codes
Block Size: 40 - 5114 bits
Puncturing: Rate=1/3 (no puncturing), Rate=1/2 (puncturing)
Turbo Decoder Representation
(Block diagram: received data and parity pass through De-puncture to Decoder 1 and Decoder 2, which are linked by an Interleaver and a De-Interleaver and produce the output data)
Turbo Decoder Solution
Implementation Details
Use Altera’s Turbo Decoder Megafunction:
– Max-logMAP decoder for maximum performance
– Includes UMTS specific interleaver
– Tailor decoder to system requirements with parameters
– Memory Bank Swap mechanism for increased throughput
Results
– Turbo decoder requires 5000 to 7000 LEs
– Required memory size depends on the number of softbits used; e.g. for softbits = 5, ESB = 138 (suitable device EP20K600E)
– On-chip Alpha and Parity memory
– Output Data Rate 2 Mbit/s
Meets 3G Requirements!
3G Specs: Pseudo Noise Generator
3G Specification
Downlink:
– 38,400 chips of 2^18 Gold code
– 512 different scrambling codes
– Grouped for efficient cell search
Uplink:
– Long Code: 38,400 chips of 2^25 Gold code
– Short Code: 256 chips, very large Kasami code
Pseudo Noise Representation
(Diagram: PN generator with I and Q outputs)
Pseudo Noise Generator Solution
Implementation Details
Use optimized basic building blocks from Altera (LPM):
LPM_SHIFTREG
– Set number of shift registers to 18
– Other options to select: a) parallel loading b) LE implementation
LPM_XOR
– Multiple XORs with different numbers of inputs
Results
Number of LEs: 43 (1% of EP20K100E)
Meets 3G Requirements!
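The shift-register-plus-XOR structure can be modelled in a few lines. The sketch below is an illustration under stated assumptions, not the schematic on the following slide: the tap positions and initial states are not given in this seminar and are assumptions here (they follow my reading of the 3GPP downlink scrambling code definition and should be checked against the specification). The function names are hypothetical.

```python
def lfsr_sequence(taps, state, length):
    """Bits of the recurrence x(i+n) = XOR of x(i+t) for t in taps.

    `state` holds x(0)..x(n-1); each step shifts once and feeds back the
    XOR of the tapped stages, like LPM_SHIFTREG plus LPM_XOR.
    """
    reg = list(state)
    out = []
    for _ in range(length):
        out.append(reg[0])
        new_bit = 0
        for t in taps:
            new_bit ^= reg[t]
        reg = reg[1:] + [new_bit]
    return out


def gold_sequence(length, offset=0):
    """XOR of two 18-stage sequences; `offset` selects the code number."""
    n = 18
    x_taps = [0, 7]             # assumed: x(i+18) = x(i+7) + x(i)
    y_taps = [0, 5, 7, 10]      # assumed: y(i+18) = y(i+10) + y(i+7) + y(i+5) + y(i)
    x = lfsr_sequence(x_taps, [1] + [0] * (n - 1), length + offset)
    y = lfsr_sequence(y_taps, [1] * n, length)
    return [x[i + offset] ^ y[i] for i in range(length)]


frame_code = gold_sequence(38400)    # one 10 ms frame of chips
```

Two shift registers and a handful of XOR gates are all the structure involves, which is consistent with the small LE count reported above.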
Total LE: 43 (1% of EP20K100E)
PN Generator Schematic (Downlink)
Traditional IF-Based Transmitter
(Block diagram: I and Q paths through BB filters, quadrature mixing at f1 with a +90º phase shift, IF filter, mixing at f2, RF filter, Amp, PA)
Issues with traditional transmitter
– Not flexible to support multiple standards
– Non-ideal local frequencies are a source of noise
– RF and analog components of the radio are more difficult to manufacture and have more reliability issues
– Higher cost
Digital I/Q - Modulator
(Block diagram: symbol mapper, BB filters, and Sin/Cos LUT in the digital section, followed by DAC, IF filter, mixing at f2, RF filter, Amp, PA)
Advantages of Digital I/Q Modulator
– Channel selection can be done in the digital domain
– Higher precision in frequency selection and shorter settling time of DDS
– Good amplitude and phase balance
– Extremely linear phase and very low shape factor of base-band filter
3G Specs: QPSK Modulation
3G Specification
Nyquist filter:
– Root Raised Cosine filter
– α = 0.22
– Sampling rate: 3.84 Mcps x 4
NCO
– 60 MHz bandwidth for channel mapping
– High SFDR
QPSK Modulator Representation
(Block diagram: S/P, I/Q Mapper, RRC filters, Interpolators, NCO)
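The root raised cosine taps implied by this specification can be computed directly from the textbook impulse response. The sketch below is an assumption-laden illustration, not the output of FIR Compiler: the span of 8 chips on either side and the function name `rrc_taps` are hypothetical choices, and the coefficients are left unquantized and unnormalized.

```python
import math


def rrc_taps(alpha=0.22, samples_per_chip=4, span_chips=8):
    """Root raised cosine impulse response sampled at 4 samples per chip."""
    taps = []
    n = span_chips * samples_per_chip
    for k in range(-n, n + 1):
        t = k / samples_per_chip                      # time in chip periods
        if k == 0:
            h = 1.0 - alpha + 4.0 * alpha / math.pi
        elif abs(abs(4.0 * alpha * t) - 1.0) < 1e-12:  # removable singularity
            h = (alpha / math.sqrt(2.0)) * (
                (1.0 + 2.0 / math.pi) * math.sin(math.pi / (4.0 * alpha))
                + (1.0 - 2.0 / math.pi) * math.cos(math.pi / (4.0 * alpha)))
        else:
            num = (math.sin(math.pi * t * (1.0 - alpha))
                   + 4.0 * alpha * t * math.cos(math.pi * t * (1.0 + alpha)))
            den = math.pi * t * (1.0 - (4.0 * alpha * t) ** 2)
            h = num / den
        taps.append(h)
    return taps


taps = rrc_taps()
center = len(taps) // 2
```

The filter is symmetric about `taps[center]`, which is what lets a hardware implementation share one multiplier between each pair of mirrored taps.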
QPSK Modulator Solution
Implementation Details
Use Altera’s Megafunctions and LPMs:
1) FIR Compiler to create RRC filter
2) NCO Compiler for NCO
3) LPM_MULT for Digital Mixer
Results
LE Count: 2092 LEs
ESB Bit Count: 49152
Performance: 115 MHz
Device: Easily fits into EP20K100E
Meets 3G Requirements!
QPSK Modulator Schematic
FIR Compiler - Root Raised Cosine
Sampling frequency
Filter type
No. of Taps
Cutoff freq.
NCO MegaWizard
Phase input to change frequency
No. of Output bits
Both Sin & Cos outputs
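The options listed above correspond to the usual phase-accumulator structure of an NCO. The sketch below shows that structure in a hedged form; it is not the code NCO Compiler generates. The accumulator width, table size, output width, and the example clock and output frequencies are all assumptions, and `make_nco` is a hypothetical name.

```python
import math


def make_nco(clock_hz, out_hz, acc_bits=32, lut_bits=10, out_bits=12):
    """Return (phase_step, run) for a phase-accumulator NCO.

    The phase step sets the output frequency; the top `lut_bits` of the
    accumulator address sine and cosine tables of `out_bits` resolution.
    """
    step = round(out_hz / clock_hz * (1 << acc_bits))
    amp = (1 << (out_bits - 1)) - 1
    size = 1 << lut_bits
    sin_lut = [round(amp * math.sin(2 * math.pi * i / size)) for i in range(size)]
    cos_lut = [round(amp * math.cos(2 * math.pi * i / size)) for i in range(size)]

    def run(n_samples):
        phase = 0
        samples = []
        for _ in range(n_samples):
            index = phase >> (acc_bits - lut_bits)
            samples.append((cos_lut[index], sin_lut[index]))
            phase = (phase + step) & ((1 << acc_bits) - 1)   # wraps like hardware
        return samples

    return step, run


# Assumed example: 61.44 MHz clock, 15.36 MHz output (one quarter of the clock).
phase_step, run = make_nco(61.44e6, 15.36e6)
first_cycle = run(4)
```

Changing `phase_step` between samples is the software equivalent of the phase input that changes frequency, and the table widths mirror the "number of output bits" option.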
NCO MegaWizard (cont.)
NCO clock speed & Output freq.
Verilog testbench for simulation
Estimated resource usage
Multiplier MegaWizard
No. of input bits
Optimal implementation for constant multiplier
Signed or Unsigned implementation
Multiplier MegaWizard
Pipelining to conserve area
Optimize for area or speed
RAKE Receiver
Multi-path estimator estimates the tap delays for the different RAKE fingers to track different multi-path components
A sliding-correlator-based estimator requires a dumping period to calculate one delay tap power, so a dedicated DLL is required for each finger for fast tracking
With large APEX devices, a full matched filter can be implemented, eliminating the need for DLLs
(Block diagram: coarse delay estimation supplies tap delays to four RAKE fingers with DLLs; the fingers report delays and a sync-lost indication; a combiner produces the combined narrowband signal)
Complex Amplitude Estimation
Complex amplitude estimator is part of the RAKE receiver and is required for coherent detection.
WMSA (Weighted Multi-slot Averaging) channel estimation filter extends the observation interval over several slots
WMSA performs better than an interpolation filter
(Block diagram: the wideband I/Q signal enters the data-channel correlator; a DMUX separates pilot symbols from data symbols; pilot symbols pass through an averaging block (1/Np) and the WMSA filter while data symbols pass through a delay; the output is the narrowband coherent signal)
Multi-user Detection / Interference Cancellation
MLSE is too complex for practical DS-CDMA systems
Most of the proposed detectors can be classified in one of two categories: Linear Interference Cancellation (IC) or Subtractive IC
Most promising scheme is Groupwise SIC (GSIC)
Users are grouped according to their SF; PIC or a decorrelator is applied within the group
The MAI of the lowest-SF group is subtracted from the MF outputs of the other users; this is done successively, group by group, up to the highest SF group
Multi-user receivers
– Optimal: MLSE
– Sub-optimal, Linear: Decorrelator, MMSE, MAI Whitening
– Sub-optimal, Interference cancellation: PIC (WB, NB), SIC, Decision Feedback
MLSE - Maximum Likelihood Sequence Estimate
MMSE - Minimum Mean Square Estimate
MAI - Multiple Access Interference
PIC - Parallel Interference Cancellation
SIC - Successive Interference Cancellation
Wide-band SIC
(Block-level representation: wideband input Wb(t); signals B1,N-1(t) to BK,N-1(t) from the previous stage; the residual signal; outputs B1,N(t) to BK,N(t))
(Detail representation: input r; integrate-and-dump (I&D) outputs a1 to aN feed Select Max, which yields the sign, amax, and Zmax; banks of regenerator/correlator blocks, each driven by a user code, feed a combiner followed by a hard decision)
Wideband SIC (cont.)
Based on the concept of signal regeneration
Regenerated wideband signals for all the users are subtracted from the input signal to produce Residual signal
Residual signal does not have signal energy from any of the tracked users
Residual signal is added back to the individual multipath component of a given user to give a cleaned wideband signal for the user
This scheme requires spreading code of different users as well as their multipath delays
This scheme can be used with both long and short spreading code
Narrowband SIC
Generated wideband signal is despread with the cross-correlated sequence and subtracted from the narrowband signal
Requires inversion of a square matrix of size KxL at the symbol rate (K = no. of users, L = no. of multipath taps per user)
Because of implementation complexity, feasible only when using a short scrambling code
(Diagram: input r; integrate-and-dump (I&D) outputs a1 to aN feed Select Max, which yields the sign, amax, and Zmax; a cross-correlation calculator supplies c1,2 to c1,N, which are subtracted from the narrowband outputs)
Binary Correlator
Key building block in a DS-CDMA system
Primitive function of a RAKE receiver
Multiple uses of a correlator:
– Pilot PN sequence
– Search for multi-path echoes
– Phase and timing information to a tracking loop
Two types of correlators:
– Sequential Correlator
– Parallel Correlator
Nova Engineering Inc. has both the IPs and can customize them for your application
Nova Engineering Inc.
E-mail: [email protected]
http://www.nova-eng.com
Sequential Correlator
Each incoming sample is multiplied by the PN sequence, which advances at the chip rate
Result is accumulated over the period of T symbols
– Integrator is dumped at the end of the period and restarted
Example:
– 4 correlators in parallel will require 64 searches to find a 256-bit pilot sequence
– Maximum search time of 64 x 256 bits = 16,384 bit periods
(Block diagram: PN Sequence Generator and integrator with inputs din, clock, and dump, and output corr_sum[0..m])
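The integrate-and-dump operation and the search-time arithmetic in the example can be sketched as follows. This is an illustration only: it assumes samples are signed numbers and PN chips are 0/1 values mapped to -1/+1, and both function names are hypothetical.

```python
def integrate_and_dump(samples, pn_bits):
    """Multiply each sample by the PN chip (+1 or -1) and accumulate.

    The returned sum is what would be read out when `dump` is asserted.
    """
    total = 0
    for sample, chip in zip(samples, pn_bits):
        total += sample if chip else -sample
    return total


def max_search_time(code_length, n_correlators):
    """Worst-case number of searches and bit periods to find the code phase."""
    searches = code_length // n_correlators
    return searches, searches * code_length


pn = [(i * 7 + 3) % 5 < 2 for i in range(256)]      # arbitrary stand-in pattern
aligned = [1 if chip else -1 for chip in pn]
peak = integrate_and_dump(aligned, pn)               # every product is +1
searches, bit_periods = max_search_time(256, 4)
```

With 4 correlators and a 256-bit pilot sequence this reproduces the 64 searches and 16,384 bit periods quoted above, and the aligned case shows the full-scale sum of 256 that marks the correct phase.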
Parallel Correlator
Data samples are held in a long shift register; the pilot PN sequence is held in the Reference Pattern Register
Contents are multiplied and integrated each time a new sample is loaded into the shift register
Example:
– 256-tap, 4-bit-wide parallel correlator
– With an input clock frequency of 20 MHz, a new correlation sum is produced every 50 ns (20 Giga-Operations Per Second)
– Results: 3800 LEs, Fmax = 45 MHz (EPF10K200S)
(Block diagram: Data Shift Register (din, clock, dout) and Reference Pattern Register (dref[0..n], ref_en) feed the Correlation Array, which outputs corr_sum[0..m])
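A behavioural sketch of the parallel structure follows. It is a simplified model, not the Nova Engineering core: the reference pattern is assumed to be 0/1 chips mapped to -1/+1, the register starts cleared, and the class name `ParallelCorrelator` is hypothetical. A short 16-chip pattern is used to keep the example readable.

```python
from collections import deque


class ParallelCorrelator:
    """Shift register of samples correlated against a fixed reference pattern."""

    def __init__(self, reference_bits):
        self.reference = [1 if b else -1 for b in reference_bits]
        taps = len(self.reference)
        self.shift_register = deque([0] * taps, maxlen=taps)

    def clock(self, sample):
        """Load one sample and return the new correlation sum."""
        self.shift_register.append(sample)      # oldest sample falls off
        return sum(r * s for r, s in zip(self.reference, self.shift_register))


pattern = [1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0]
correlator = ParallelCorrelator(pattern)
stream = [1 if b else -1 for b in pattern]
sums = [correlator.clock(s) for s in stream]
```

A new sum appears on every clock, so the peak of 16 shows up on the very sample that completes the pattern. That one-result-per-clock behaviour is the trade the parallel form makes against the much smaller sequential correlator.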
Echo Canceller
Basics of Echo Cancellation
Telephone network made up of two major components:
– 4-wire or toll network
– 2-wire or local network
2-wire to 4-wire conversion process (using transformers) leaks signals creating echo
Round trip echo path delay can range from few milliseconds to hundreds of milliseconds (satellite hops)
Echo Cancellation is required to suppress the echo and maintain a clean channel
Echo Cancellers are adaptive filters which model the echo path to generate an estimated echo and subtract it from the real signal.
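The adaptive-filter idea in the last point can be illustrated with a normalized LMS (NLMS) update. NLMS is chosen here only as a common textbook algorithm; the seminar does not say which adaptation algorithm the Altera echo canceller uses. The tap count, step size, and echo path below are made-up values, and the function name is hypothetical.

```python
import random


def nlms_echo_canceller(far_end, near_end, n_taps=8, mu=0.5, eps=1e-6):
    """Model the echo path with an adaptive FIR filter and subtract its output.

    far_end  : reference signal sent toward the hybrid
    near_end : signal coming back, containing the echo
    Returns the final filter weights and the residual (echo-cancelled) signal.
    """
    weights = [0.0] * n_taps
    history = [0.0] * n_taps                  # newest far-end sample first
    residual = []
    for reference, returned in zip(far_end, near_end):
        history = [reference] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = returned - echo_estimate      # what is left after subtraction
        power = eps + sum(x * x for x in history)
        weights = [w + mu * error * x / power for w, x in zip(weights, history)]
        residual.append(error)
    return weights, residual


random.seed(1)
echo_path = [0.0, 0.5, -0.3, 0.1]             # hypothetical hybrid leakage
far = [random.uniform(-1.0, 1.0) for _ in range(2000)]
near = [sum(h * far[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(len(far))]
weights, residual = nlms_echo_canceller(far, near)
```

After adaptation the first four `weights` approximate `echo_path` and the residual shrinks toward zero. A longer echo tail simply means more taps, which is where the parallel multiply-accumulate resources of a PLD come in.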
Evolutionary GSM Core Network
(Network diagram. Elements shown: USIM, ME, BTS, RNC, 3G MSC/VLR, SCP, HLR, 3G SGSN, GGSN, PSTN/ISDN, and the packet data network (internet, X.25, ...); annotated "Echo Cancellation required")
Key Specification
– 64 ms minimum delay
– Varying tail lengths
Version 2.0: Complete Echo Canceller Solution (Black Box)
(Block diagram: Echo Canceller with the following interfaces)
TDM Line (Hybrid side): HTxD, HRxD
TDM Line (Network side): NRxD, NTxD
TDM timing: TDM_Clk, Fsync
Host Processor: Address[7:0], Data[7:0], /OE, /WE, /CS
External SRAM: EM_Addr[14:0], EM_Data[47:0], EM_/OE, EM_/WR, EM_/CS
Reset and clocks: /Reset, EC_Clk, Slow_Clk
Echo Canceller MegaWizard
Advantages of PLD Solution
Unique, Optimized and Highly Efficient Architecture
Flexible Data and Coefficient Width (accuracy)
Parallel Processing Using Multiple MACs - Overcoming the Typical One-MAC DSP Bottleneck
On-Chip Multi-Channel CODEC Interface - Overcoming the I/O Bottleneck
Low Cost High Performance Solution
Key Technology Benefits
Flexible Architecture
Compliance with ITU G.165 and G.168 standards
Multi Channel up to 768 Channels on one chip
Tail Length up to 256 msec
Handles maximum delay of 128 ms
u-Law/ A-Law Support
Reduction of Board Size
Low Power
Improved System Reliability
Fast Convergence
Double Talk Support
FAX Tone Detection
DTMF detection
Reduction of System Cost
Conclusion
Programmable solution is the right solution for 3G
Only provider of Turbo coder/decoder IP
IPs are configurable; can accommodate future changes to the standards
Integrated design environment with automation (e.g. FIR and NCO Compiler) is the FASTEST time-to-market solution in the industry
APEX features make it feasible to effectively do high-performance optimal designs
Altera has the most complete solution for 3G
Forward Error Correction Products
Agenda
Viterbi Decoder
– Features
– Functional Description
– Parameters and Deliverables
– Performance
– Roadmap
Turbo Encoder/Decoder
– Features
– Functional Description
– Parameters and MegaWizard
– Performance
Viterbi Decoder
Viterbi Decoder Features
Supports a range of constraint lengths and rates
Meets 3G requirements of K=9 and r=1/2 and r=1/3
Supports punctured data
– High speed decoder can use external or internal (self-synchronizing) depuncturing
– Low speed decoders use external depuncturing
MegaWizard™ Plug-In for easy parameterization
AWGN test case generator for performance testing
BERT circuitry included with core
Convolutional Encoder
Uses shift registers that contain a history of the bit streams
(Figure: simple example of a convolutional encoder with input IN and outputs OUT1 and OUT2)
(Figure: state diagram of the convolutional encoder with states 00, 01, 10, and 11)
Trellis Diagram
(Figure: trellis over successive symbol times, starting in state 00 and fanning out to states 00, 10, 11, and 01)
A trellis is a method of showing the state machine over time
Every incoming symbol has a path in the trellis
Each branch has an accumulated sum of the differences between received and expected symbols
After a certain number of states have been passed, the decoder performs a “traceback” and confirms the valid decoded path as the path with the least accumulated difference
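The trellis, accumulated differences, and traceback can be shown with a small hard-decision decoder. This sketch is an illustration, not the MegaCore architecture: it assumes a 4-state (K=3) rate 1/2 code with generators 7 and 5 (octal), which the slides do not specify, and it traces back once over the whole block rather than with a sliding window.

```python
K = 3
GENERATORS = (0b111, 0b101)        # assumed example code; not from the slides
N_STATES = 1 << (K - 1)
INF = float("inf")


def branch(state, bit):
    """Return (next_state, expected output pair) for one trellis branch."""
    reg = (bit << (K - 1)) | state
    outputs = tuple(bin(reg & g).count("1") & 1 for g in GENERATORS)
    return reg >> 1, outputs


def encode(bits):
    state, coded = 0, []
    for b in bits:
        state, pair = branch(state, b)
        coded.extend(pair)
    return coded


def viterbi_decode(symbols):
    metrics = [0] + [INF] * (N_STATES - 1)       # encoder starts in state 00
    history = []
    for i in range(0, len(symbols), 2):
        received = symbols[i:i + 2]
        new_metrics = [INF] * N_STATES
        survivors = [None] * N_STATES
        for state in range(N_STATES):
            if metrics[state] == INF:
                continue
            for bit in (0, 1):
                nxt, expected = branch(state, bit)
                # add: accumulate the difference between received and expected
                metric = metrics[state] + sum(a != b for a, b in zip(received, expected))
                # compare, select: keep the better path into each state
                if metric < new_metrics[nxt]:
                    new_metrics[nxt] = metric
                    survivors[nxt] = (state, bit)
        metrics = new_metrics
        history.append(survivors)
    # traceback from the state with the least accumulated difference
    state = metrics.index(min(metrics))
    decoded = []
    for survivors in reversed(history):
        state, bit = survivors[state]
        decoded.append(bit)
    return decoded[::-1]


message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]         # last two zeros flush the encoder
received = encode(message)
received[5] ^= 1                                  # one channel error
decoded = viterbi_decode(received)
```

Despite the flipped bit, `decoded` matches `message`, because the correct path still has the smallest accumulated difference. The inner loop is the add-compare-select step and the final loop is the traceback.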
Viterbi Decoder Block Diagram
(Block diagram: Branch Metrics Computation; Add, Compare, Select; Metric Storage Memory; Trace Back Path; Decoder Control. Signals: Symbol, Valid, SysClk, RESET, RR)
Soft decision information
Why use soft bits?
– Radio and cable systems are analog, so at some point they require an ADC
– Soft decision bits are extra precision bits from the analog domain
– Since we calculate the difference from an expected value, the more analog information we can pass through to the decoder, the more accurate the end result will be
– Soft bits are a method of expressing the “strength” of a received “0” or “1”
– Use “signed style” to indicate the likely bit value and its strength
• msb is likely bit value
• lsbs are strength (use 2’s complement form)
0111 - Most Likely 0
0110
0101
0100
0011
0010
0001 - Least Likely 0
0000 - Erased
1111 - Least Likely 1
1110
1101
1100
1011
1010
1001 - Most Likely 1
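The table above can be read mechanically: treat the 4-bit value as a 2's complement number, take the sign as the likely bit, and take the magnitude as the strength. A minimal sketch follows; the function name `interpret_soft_bit` and the use of `None` for an erasure are assumptions made for illustration.

```python
def interpret_soft_bit(value, width=4):
    """Return (likely_bit, strength) for a 2's complement soft value.

    A value of zero carries no information and is reported as an erasure.
    """
    sign_bit = (value >> (width - 1)) & 1
    signed = value - (1 << width) if sign_bit else value
    if signed == 0:
        return None, 0                 # erased: no preference for 0 or 1
    return sign_bit, abs(signed)


strongest_zero = interpret_soft_bit(0b0111)
weakest_one = interpret_soft_bit(0b1111)
erased = interpret_soft_bit(0b0000)
```

Here `strongest_zero` is a 0 with strength 7, `weakest_one` is a 1 with strength 1, and `erased` reports no decision. The code 1000 does not appear in the table; this sketch would report it as a 1 with strength 8.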
Code Rates and Puncturing
Convolutional encoders add additional information to the encoded bit stream, always at a fixed integer ratio.
A rate 1/2 encoder produces 2 output bits for every input bit
Since this represents a massive transmission overhead, it is also possible to “puncture” the code (simply remove bits) before transmission to save bandwidth.
The puncturing scheme below is 2/3 rate puncturing and is based on rate 1/2 coding
Puncturing schemes have a pattern which determines which bits are removed
0 1 0 1 0 0 0 0 0 1 0 1 0 1 0 1 1 0 1 0 1 0
1 0 0 1 1 0 1 0 0 0 1 1 1 0 1 0 1 0 1 1 1 0
(Removed information bits were marked in red on the original slide)
Decoder replaces punctured bit locations with “0” before decoding
– Requires external synchronisation to indicate punctured positions
Rates 2/3, 3/4, 4/5, 5/6, 6/7, 7/8 supported
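Puncturing and depuncturing can be sketched as a pair of small functions. The pattern used below, which keeps three of every four coded bits to turn rate 1/2 into rate 2/3, is an assumed example; the actual pattern on the slide was carried by colour and is not recoverable from the text. The soft values use +7 and -7 for a strong 0 and a strong 1, with 0 as the erasure.

```python
PATTERN = (1, 1, 1, 0)      # assumed example: 1 = transmit, 0 = remove


def puncture(coded, pattern=PATTERN):
    """Drop the coded bits whose pattern position is 0."""
    return [b for i, b in enumerate(coded) if pattern[i % len(pattern)]]


def depuncture(received, total_length, pattern=PATTERN, erasure=0):
    """Re-insert an erasure value at every punctured position."""
    values = iter(received)
    return [next(values) if pattern[i % len(pattern)] else erasure
            for i in range(total_length)]


coded = [7, -7, 7, 7, -7, -7, 7, -7]      # 4 input bits at rate 1/2
sent = puncture(coded)                     # 6 values on the channel: rate 4/6 = 2/3
restored = depuncture(sent, len(coded))
```

The decoder sees `restored`, in which the removed positions hold the erasure value and so contribute nothing to either branch metric. The receiver must know where the pattern starts, which is the external synchronisation mentioned above.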
Convolutional Coding Summary
The encoder is simple to implement
– That’s why we don’t market a MegaCore Function
The decoder is far more complex and requires much arithmetic and memory to work effectively
State machine complexity is determined by “constraint length” which has a big impact on design size and speed
Trade-offs required in architecture to allow large constraint length Viterbi decoders to be used in Programmable Logic
Viterbi Decoder MegaCore Functions
High Speed Parallel
– Pure Logic Implementation
– Performance: over 100 Mb/s
– Fully Parallel Operation
– Hard Decision and Soft Decision
Hybrid Serial/Parallel
– Mixed Serial/Parallel Implementation
– Typical Performance: 1-8 Mb/s
– Medium Logic Area
Low Speed Serial
– Memory-Based Architecture
– Typical Performance: 500 kb/s - 3 Mb/s
Parameters
Fully User Parameterized
– n - number of code bits (2 to 4)
– L - constraint length (3 to 9)
– v - traceback length
– bmg - branch metric width
– rate - puncturing rate
– gx - generator polynomials
Core is completely parameterized in HDL
– All trellis connections synthesized automatically
– All branch metrics calculated automatically
BER Testing and Verification
Testing of Viterbi Decoders requires large complex test cases Two utilities included with core to generate and analyze test cases Stimulus and Channel noise:
– Completely paramaterizable
– Generates a random bitstream, encodes it, and adds errors
– Writes out Quartus or Max+PlusII vector file
– Writes out differences between transmitted and received symbols
Error Correction Performance:
– Extracts test results from Quartus or Max+PlusII simulation
– Returns Bit Error Rate (BER), by comparing transmitted bits with decoded bits
Test system level performance and choice of parameters of Viterbi decoder without the need to compile
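The BER measurement described above can be sketched as follows. This is not the supplied utilities; the function names and the simple bit-flip channel are assumptions for illustration only.

```python
import random

def add_bit_errors(bits, error_prob, rng):
    """Flip each bit independently with probability `error_prob`."""
    return [b ^ (rng.random() < error_prob) for b in bits]

def bit_error_rate(transmitted, decoded):
    """Fraction of positions where the decoded bit differs from the sent bit."""
    errors = sum(t != d for t, d in zip(transmitted, decoded))
    return errors / len(transmitted)

rng = random.Random(0)                       # fixed seed for repeatable tests
tx = [rng.randint(0, 1) for _ in range(1000)]
rx = add_bit_errors(tx, 0.01, rng)           # stands in for channel + decoder
ber = bit_error_rate(tx, rx)
```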
96
© 2000 Altera Corporation
Viterbi Decoder Deliverables
Encrypted *.tdf design files
Executable for setting up vector files and test cases
– vvece.exe
Executable for analysing the results of simulation
– vtblaa.exe & vtblab.exe
User guide with examples
97
© 2000 Altera Corporation
Performance
Performance of different architectures
Columns: CONSTRAINT LENGTH = 5, CONSTRAINT LENGTH = 7, CONSTRAINT LENGTH = 9
SERIAL: 500 LEs, 3 Mb/s | 700 LEs, 2 Mb/s | 1300 LEs, 500 kb/s
SERIAL/PARALLEL: 1300 LEs, 8 Mb/s | 2600 LEs, 2 Mb/s
PARALLEL: 1500 LEs, 100+ Mb/s | 7500 LEs, 100 Mb/s
98
© 2000 Altera Corporation
Turbo Encoder/Decoder
99
© 2000 Altera Corporation
Turbo Codec in a PLD Article
100
© 2000 Altera Corporation
Turbo Encoder/Decoder Features
Compliant with 3rd Generation Partnership Project (3GPP); Technical Specification Group Radio Access Network; Multiplexing and Channel Coding (FDD) (3G TS 25.212 version 3.1.0)
High-performance max-logMAP (logarithmic ‘maximum a posteriori’) decoder for maximum error correction
Data rates in excess of 2 Megabits per second (Mbps)
Includes 3GPP-compliant mother interleaver
Interleaver block sizes from 40 to 5,114 bits
101
© 2000 Altera Corporation
Turbo Encoder/Decoder Features (cont..)
Block size can change between each block
Soft values (logarithmic likelihood) from 3 to 8 bits
Optional two memory banks for maximum throughput
Optimized for the Altera® APEX™ 20K and APEX 20KE architectures
MegaWizard™ Plug-In for easy parameterization
102
© 2000 Altera Corporation
Turbo Coding System Level Description
[Figure: Turbo Encoder (Encoder 1, Encoder 2, Interleaver, Puncture), Channel, Turbo Decoder (De-Puncture, Decoder 1, Decoder 2, Interleaver, De-Interleaver)]
103
© 2000 Altera Corporation
System Level Turbo Encoding
Block based coding
– 3GPP block sizes 40 to 5114 bits
Two interleaved convolutional encoders generate parity streams P1 and P2
Parity streams can be punctured
– 3GPP code rates:
• Rate 1/3 (no puncturing)
• Rate 1/2
[Figure: Turbo Encoder block diagram with Encoder 1, Encoder 2, Interleaver, Puncture]
104
© 2000 Altera Corporation
System Level Turbo Decoding
Iterates between decoders
– Typically 3 to 8 iterations
Decoder 1 uses P1 stream from encoder 1
Decoder 2 uses P2 stream from encoder 2
Each decoder is soft-in / soft-out (SISO)
[Figure: Turbo Decoder block diagram with De-Puncture, Decoder 1, Decoder 2, Interleaver, De-Interleaver]
105
© 2000 Altera Corporation
Turbo Decoder Block Diagram
[Figure: block diagram with Information Memory, Apriori Memory, Parity Memory, Alpha Memory, max-logMAP Decoder, 3GPP Interleaver, Control Processor]
106
© 2000 Altera Corporation
Turbo Decoder Operation
Step 1: Processor initialises turbo decoder core
– Block size
– Number of iterations
Step 2: Processor writes information and parity likelihood values into appropriate memory
Step 3: Turbo decoder processes data
Step 4: Processor reads corrected information likelihood values from information memory
Step 5: Repeat from 2 (or from 1 if block size is different)
107
© 2000 Altera Corporation
max-logMAP Decoder
Turbo decoder requires soft-in / soft-out decoder
Candidate algorithms are
– Soft Output Viterbi Algorithm (SOVA)
– Maximum A-Posteriori (MAP)
MAP has superior performance
– Works with 0.8dB to 1.3dB lower signal to noise ratio (SNR)
MAP is larger (requires large alpha matrix memory)
logMAP is MAP in logarithmic domain (avoid and )
max-logMAP is simplification of logMAP
– Approximately 0.3dB poorer performance
– Difference reduces with mis-matched channel estimation
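The simplification from logMAP to max-logMAP can be illustrated with the standard log-domain addition identity log(e^a + e^b) = max(a, b) + log(1 + e^-|a-b|). The sketch below compares the exact operation with the max-only approximation; the function names are assumptions for this example and say nothing about how the core is built internally.

```python
import math

def max_star(a, b):
    """Exact log-domain addition, as used by logMAP: log(exp(a) + exp(b))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_approx(a, b):
    """max-logMAP simplification: drop the correction term, keep only max."""
    return max(a, b)

exact = max_star(1.0, 2.0)
approx = max_approx(1.0, 2.0)
correction = exact - approx        # never larger than log(2)
```

Dropping the correction term removes a look-up or log/exp evaluation from every metric update, which is the trade for the small performance loss quoted above.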
108
© 2000 Altera Corporation
3GPP Interleaver
Turbo internal interleaver is block interleaver
– re-orders data within a block
Interleaver implemented as address sequence generator
– Read or write memories in interleaved or non-interleaved order
– Data stored in memories in non-interleaved order
3GPP interleaver block sizes vary from 40 to 5114 bits
– All block sizes have different interleaver
Most hardware implementations are look-up tables
– Require large memory: 5k × 13
– Require processor to generate memory data
Altera has developed algorithmic 3GPP interleaver
109
© 2000 Altera Corporation
Turbo MegaWizard
Bit width for soft decision values (3 to 8)
Pipeline delay for external memory accesses (2 to 6)
• separate values for parity and alpha memories
Dual information memory banks
• Increase throughput
• Unload data from previous block and load data for next block while decoding current block
• User needs to implement dual parity memory banks
110
© 2000 Altera Corporation
Turbo Decoder Performance
max-logMAP decoder requires 2 clock cycles per information bit
Each iteration requires 4 clock cycles per bit (decoder 1 + decoder 2)
Five iterations require 20 clock cycles per bit (plus ~5 cycles per bit for loading and unloading data)
max-logMAP runs at ~50 MHz in 20KE-1
50 MHz at 25 cycles per bit gives 2 Mb/s
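The cycle counts on this slide combine into a throughput estimate as sketched below. The function and parameter names are assumptions for illustration; the defaults are the figures quoted above.

```python
def turbo_throughput_bps(clock_hz, iterations,
                         cycles_per_bit_per_decoder=2,
                         decoders_per_iteration=2,
                         io_cycles_per_bit=5):
    """Estimate decoded bits per second from the per-bit cycle budget."""
    cycles_per_bit = (iterations * decoders_per_iteration
                      * cycles_per_bit_per_decoder + io_cycles_per_bit)
    return clock_hz / cycles_per_bit

rate_5_iter = turbo_throughput_bps(50e6, iterations=5)   # 25 cycles per bit
rate_8_iter = turbo_throughput_bps(50e6, iterations=8)   # 37 cycles per bit
```

With these assumptions, raising the iteration count from 5 to 8 lowers the estimate from 2 Mb/s to roughly 1.35 Mb/s.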
111
© 2000 Altera Corporation
Logic and Memory Requirements
Altera turbo decoder requires 5000 to 7500 LEs
Size of memories (N is number of soft bits)
112
© 2000 Altera Corporation
Example Turbo Decoder Configurations
113
© 2000 Altera Corporation
Turbo Decoder Bit Error Rate
Example results from C model simulations
114
© 2000 Altera Corporation
FIR Filter Compiler
115
© 2000 Altera Corporation
Features
High-Performance FIR Filter Synthesizer
– Fully integrated finite impulse response (FIR) filter development environment
Optimized for the APEX 20K and FLEX 10K device architectures
Fully parallel or serial arithmetic architectures
Uses any Number of Taps
Built-in Coefficient Generator
Imports Coefficients from third-party tools
– Floating-point or integer
Uses multiple coefficient scaling algorithms
Provides floating-point to fixed-point coefficient analysis
116
© 2000 Altera Corporation
Features (Continued)
Supports coefficient widths from 4 to 32 bits of precision
Supports signed or unsigned input data widths, from 4 to 32 bits wide
User-selectable output precision via rounding and saturation
Automatic Recognition of Symmetrical, Non-Symmetrical and Anti-Symmetrical Filters
– Automatic interpolation and decimation filters
Provides resource estimates dynamically
Creates MATLAB, Simulink, VHDL, and Verilog HDL simulation models
Includes an impulse, step function, and random input testbed
117
© 2000 Altera Corporation
Complete Design Environment
DSP System-Level Tools
System specification
Architecture explorationFIR coefficient
PLD Tools
HDL entry HDL simulation Synthesize MAP
place-and-route
FIR CompilerMegaWizard Tools
Parameterize function
Shorter design times
118
© 2000 Altera Corporation
Complete Design Environment
Built-in coefficient generation or import coefficients from Third-Party Tools (MATLAB®)
Floating point to fixed point conversion plus analysis
Models and testbeds for both Verilog HDL and VHDL
MATLAB and Simulink® Integration
Reference designs supplied with the compiler
Resource estimator allows user to interactively trade-off area/speed without compiling
FLEX® 10K and APEX™ 20K Family
* MATLAB and Simulink are registered trademarks of The MathWorks, Inc.
119
© 2000 Altera Corporation
FIR Specification Flow
Simulation Output Format
Input / Output Data Bus Width, Type
PLD Architecture Serial, Parallel
Size Estimator
Coefficient Values Number of Taps,Precision
Output Data Bus Full /Rounded/ Saturated
Precision
Easy Access to all FIR Parameters using MegaWizard plug-in
MegaWizard GUI has the same Look and Feel as other IP Cores from Altera
Step 1
Step 2
Step 3
Step 4
Step 5
120
© 2000 Altera Corporation
Step 1: Input Bus Specification
Input Format
– Signed / Unsigned
– Any data width
121
© 2000 Altera Corporation
Step 2: Output Precision Specification
Output Format
– Full precision
– Saturation
– Rounding
122
© 2000 Altera Corporation
Step 3: Coefficient Specification
123
© 2000 Altera Corporation
Step 4: Coefficients Scaling
Floating- to Fixed-point conversion
Apply scaling Factors
124
© 2000 Altera Corporation
Step 5: Scaling Analysis
Observe scaling effects
Observe fixed vs. floating point precision
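A rough sketch of what a floating- to fixed-point conversion and its error analysis can look like. Scaling so that the largest coefficient fills the available width is one assumed scaling rule for illustration; the compiler's own scaling algorithms are not described in these slides.

```python
def quantize_coefficients(coeffs, bits):
    """Scale floating-point coefficients to signed integers of `bits` width."""
    full_scale = (1 << (bits - 1)) - 1            # e.g. 127 for 8 bits
    scale = full_scale / max(abs(c) for c in coeffs)
    fixed = [round(c * scale) for c in coeffs]    # integer coefficients
    # Error of each coefficient after converting back to floating point.
    errors = [c - f / scale for c, f in zip(coeffs, fixed)]
    return fixed, scale, errors

fixed, scale, errors = quantize_coefficients([0.5, -0.2, 0.1], bits=8)
worst_error = max(abs(e) for e in errors)
```

Re-running with a larger `bits` value shows the fixed-point coefficients converging towards the floating-point ones, which is the effect the analysis step lets the user observe.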
125
© 2000 Altera Corporation
Step 6: Architecture Setting
Serial / parallel architecture trade-offs
Dynamic size estimator based on parameters entry
Multi-channel support for I/Q modulation
126
© 2000 Altera Corporation
Step 7: Verification Output Files
Multiple verification output formats:
– HDL: VHDL and Verilog HDL
– MATLAB: M-File and Simulink model
127
© 2000 Altera Corporation
MATLAB - Simulink Interface
FIR compiler generates Simulink models based on parameters entry
Speeds up DSP system-level analysis
128
© 2000 Altera Corporation
NCO Compiler
129
© 2000 Altera Corporation
Digital NCO Compiler
An NCO May Be Used To Generate a Carrier or Modulate a Signal Onto a Carrier
Carrier Generation
Frequency Modulation
130
© 2000 Altera Corporation
NCO Compiler Features
Supports Multiple Architectures
– ROM Based (EAB/ESB or External ROM)
• Large ROM - fast
• Small ROM - memory efficient
– CORDIC Based (Logic or EAB/ESB based)
• Recommended for high angular precision
• Smallest ROM required
Implementation with FLEX 6000, FLEX 10K, APEX 20K series
131
© 2000 Altera Corporation
Features (Continued)
Single or Dual Outputs (Sine/Cosine)
Variable width Frequency Modulation input
User defined Angular Precision in the Phase Accumulator
User defined Angular Precision in the Sine Wave Generator
User defined Magnitude Precision
Automatic ROM file Generation
Two’s Complement Signed Numbers
132
© 2000 Altera Corporation
Dynamically Generated Matlab Interface
Stimulus / Testbed
– Standard Inputs
• Impulse, Step, Random Inputs
– Time and Frequency Domain Displays
– Text Output
Models
– Standard M-files and Simulink S-functions
– Clock Cycle and Bit Accurate Models
133
© 2000 Altera Corporation
Dynamically Generated Verilog
Standard Stimulus
– Impulse
– Step
– Pseudo-random Data
Output
– $Dumpvars - Use Any Waveform Viewer
– ASCII Text - Conversion to Signed Number
Model
– Verilog Model of NCO
– Clock Cycle and Bit Accurate Matlab Models
134
© 2000 Altera Corporation
NCO Compiler Function
SIN/COS Waveform Generator
Z-1
Frequency Offset
Frequency Set
SIN Out
COS Out
User Configurable Parameters
– CORDIC or ROM Implementation
– Frequency Resolution
– Angular Resolution
– Magnitude Resolution
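A behavioural sketch of a ROM-based NCO: a phase accumulator (the Z-1 feedback register) addressed into a sine table. The bit widths and names are assumptions chosen for this example; they map loosely onto the frequency, angular, and magnitude resolution parameters listed above.

```python
import math

def make_sine_rom(addr_bits, mag_bits):
    """Pre-compute one full sine period as two's-complement integers."""
    size = 1 << addr_bits
    full_scale = (1 << (mag_bits - 1)) - 1
    return [round(full_scale * math.sin(2 * math.pi * i / size))
            for i in range(size)]

def nco(freq_word, n_samples, acc_bits=16, addr_bits=8, mag_bits=10):
    """Return (sin, cos) sample lists; f_out = freq_word * f_clk / 2**acc_bits."""
    rom = make_sine_rom(addr_bits, mag_bits)
    quarter = len(rom) // 4                    # cosine = sine advanced by 90 degrees
    phase = 0
    sin_out, cos_out = [], []
    for _ in range(n_samples):
        addr = phase >> (acc_bits - addr_bits)  # top accumulator bits address ROM
        sin_out.append(rom[addr])
        cos_out.append(rom[(addr + quarter) % len(rom)])
        phase = (phase + freq_word) & ((1 << acc_bits) - 1)
    return sin_out, cos_out

s, c = nco(freq_word=1 << 12, n_samples=16)    # one output period = 16 samples
```

A wider accumulator gives finer frequency resolution, more ROM address bits give finer angular resolution, and more magnitude bits give finer amplitude resolution.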
135
© 2000 Altera Corporation
NCO Compiler Implementation
Set Angular and Magnitude Precision
Select type of Sinusoid Generation
Output Selection Modulation Input
Specification
136
© 2000 Altera Corporation
NCO Compiler Implementation
Select Simulation Output models
Output frequency specification
137
© 2000 Altera Corporation
Included Reference Designs
QPSK Modulator
All Digital Phase Lock Loop
Direct Digital Synthesizer
Users can perform Matlab, Simulink, Verilog, MAX+Plus II and Quartus Simulations
138
© 2000 Altera Corporation
QPSK Modulator: Reference Design
139
© 2000 Altera Corporation
Direct Digital Synthesizer: Ref. Design
140
© 2000 Altera Corporation
All Digital PLL: Reference Design
141
© 2000 Altera Corporation
Design Techniques using LPM (Library of Parameterized Modules)
The Basic Building Blocks for DSP Functions
142
© 2000 Altera Corporation
LPM Features: Basic Building Blocks
Parameterized Modules
– Users specify the parameters to set the features and functionality
– VHDL, Verilog HDL, AHDL, Schematic, EDIF Input Files
Efficient Design Mapping
– Altera provides optimum solutions: Guaranteed!
– Architecture independent designs without sacrificing efficiency
Specification of a Complete Design
– Use LPM functions as basic building blocks to develop complex designs
Architecture Independent Design Entry
– Supported by the leading EDA tool vendors
LPM Building blocks are fully Optimized for Altera Devices
143
© 2000 Altera Corporation
LPM Support in Quartus and MAX+PLUS II
Altera supports the following Industry Standard LPM Functions
Arithmetic Components:
– LPM_ABS, LPM_ADD_SUB, LPM_COMPARE, LPM_COUNTER, LPM_MULT
Storage Components:
– LPM_DFF, LPM_FF, LPM_LATCH, LPM_RAM_DQ, LPM_RAM_IO, LPM_ROM, LPM_SHIFTREG, LPM_TFF
Gates:
– LPM_AND, LPM_BUSTRI, LPM_CLSHIFT, LPM_CONSTANT, LPM_DECODE, LPM_INV, LPM_MUX, LPM_OR, LPM_XOR
144
© 2000 Altera Corporation
Fundamental DSP Building Blocks
Add: LPM_ADD_SUB
Subtract: LPM_ADD_SUB
Multiply: LPM_MULT
Divide: LPM_DIVIDE
Compare: LPM_COMPARE
Shift Register: LPM_SHIFTREG
XOR: LPM_XOR
Storage: LPM_RAM, ROMs, FIFO, etc.
145
© 2000 Altera Corporation
LPM_ADD_SUB: Parameterized Adder/Subtractor
LPM_DIRECTION=
LPM_PIPELINE=
LPM_REPRESENTATION=
LPM_WIDTH=
ONE_INPUT_IS_CONSTANT=
Parameters
Name | Required | Value | Description
LPM_DIRECTION | No | "ADD", "SUB", "DEFAULT" |
LPM_PIPELINE | No | Integer >=0 | Specifies the number of clock cycles of latency associated with the result[ ] output. The default value is 0 (non-pipelined).
LPM_REPRESENTATION | No | "SIGNED", "UNSIGNED" | Default value is "SIGNED"
LPM_WIDTH | Yes | Integer >0 |
ONE_INPUT_IS_CONSTANT | No | "YES", "NO" | Provides greater optimization if an input is constant.
add_sub
cin
DataA[ ]
Clock
DataB[ ]
aclr
result[ ]
Overflow
Cout
146
© 2000 Altera Corporation
LPM_ADD_SUB: Pipeline Parameter
Pipelining in the MegaWizard GUI is not the same as adding a pipeline register after the LPM_ADD_SUB function.
Trades off Area vs. Performance.
– LPM_ADD_SUB with an external register uses fewer resources but is slower.
• Must have FAST synthesis set for register to be merged into LE
– LPM_ADD_SUB with a single pipeline stage uses more resources but is faster (* beware of the bypass path for tco)
Design Tip
147
© 2000 Altera Corporation
Adder1,2.gdf
10K30E-1, 49 LEs, 169 MHz
10K30E-1, 59 LEs, 222 MHz
148
© 2000 Altera Corporation
Using LPM_ADD_SUB: Signed Numbers
MegaWizard Assumes Signed Numbers
Sign Extend Data Width and Ignore Carry
– Example: Adding Two 8 Bit Signed Buses
• Sign extend to 9 Bit and use a 9-bit “SIGNED” LPM_ADD_SUB
• Use the 9 bit result and ignore the carry out
– Example: Adding Two 8 Bit Unsigned Buses
• Set the 9th Bit to zero and use a 9-bit “SIGNED” LPM_ADD_SUB
• Use the 9 bit result and ignore the carry out
Use LPM_ADD_SUB to access “UNSIGNED” Numbers.
ADD_SUB port will increase the design size and reduce speed
If possible use ADD functions with sign extension
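The two 8-bit examples above can be checked with a small bit-level model. This is only an arithmetic sketch of a 9-bit signed adder; the helper names are assumptions and it does not model the LPM function itself.

```python
def to_signed(value, width):
    """Interpret a `width`-bit pattern as a two's-complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value

def sign_extend(value, from_width, to_width):
    """Replicate the sign bit into the new upper bits."""
    value &= (1 << from_width) - 1
    if value >> (from_width - 1):
        value |= ((1 << to_width) - 1) & ~((1 << from_width) - 1)
    return value

def add_signed_8(a, b):
    """Add two 8-bit signed patterns on a 9-bit adder, ignoring the carry out."""
    return (sign_extend(a, 8, 9) + sign_extend(b, 8, 9)) & 0x1FF

def add_unsigned_8(a, b):
    """Add two 8-bit unsigned patterns on the same adder with the 9th bit = 0."""
    return ((a & 0xFF) + (b & 0xFF)) & 0x1FF

most_negative = to_signed(add_signed_8(0x80, 0x80), 9)   # -128 + -128
largest_unsigned = add_unsigned_8(0xFF, 0xFF)            # 255 + 255
```

In both cases the 9-bit result holds the full sum, so the carry out carries no extra information and can be ignored.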
149
© 2000 Altera Corporation
Adder3.gdf
Using an Unsigned LPM
150
© 2000 Altera Corporation
Adder5.gdf
Using the MegaWizard
Sign Extension
151
© 2000 Altera Corporation
LPM_MULT: Parameterized Multiplier
INPUT_A_IS_CONSTANT=
INPUT_B_IS_CONSTANT=
LPM_PIPELINE=
LPM_REPRESENTATION=
LPM_WIDTHA=
LPM_WIDTHB=
LPM_WIDTHP=(LPM_WIDTHA+LPM_WIDTHB)
LPM_WIDTHS+LPM_WIDTHA
Clock
DataA[ ]
Sum[ ]
DataB[ ]
aclr
Result[ ]
Allows two signed or unsigned numbers to be multiplied
The result of the multiplication can be added to a third number
Parameters
Name | Required | Value | Description
INPUT_A_IS_CONSTANT | No | "YES", "NO" | If DataA[ ] is connected to a constant value, setting the value to "YES" optimizes the multiplier for resource usage and speed. The default value is "NO"
INPUT_B_IS_CONSTANT | No | "YES", "NO" | If DataB[ ] is connected to a constant value, setting the value to "YES" optimizes the multiplier for resource usage and speed. The default value is "NO"
LPM_PIPELINE | No | Integer >=0 | Specifies the number of clock cycles of latency associated with the result[ ] output. The default value is 0 (non-pipelined).
LPM_REPRESENTATION | No | "SIGNED", "UNSIGNED" | Type of Multiplication. Default is "UNSIGNED"
LPM_WIDTHA | Yes | Integer >0 |
LPM_WIDTHB | Yes | Integer >0 |
LPM_WIDTHP | No | Integer >0 | Width of the result[ ] port. Default is LPM_WIDTHA+LPM_WIDTHB
LPM_WIDTHS | No | Integer >=0 | Same as LPM_PIPELINE
152
© 2000 Altera Corporation
Using LPM_MULT
Use Pipelining Feature for Speed!– It may result in no additional resources: “FREE”
Use Area vs Speed Tradeoff Switch in the MegaWizard GUI
– Change default setting to Area to get smaller result when using signed numbers
Design Tip
153
© 2000 Altera Corporation
mult2.gdf
Device: 10K30E-1
Signed / Unsigned | Optimized | Pipeline | LEs | Fmax (MHz)
Unsigned | Speed | 3 | 226 | 113
Signed | Speed | 3 | 228 | 114
Unsigned | Area | 3 | 226 | 113
Signed | Area | 3 | 183 | 175
154
© 2000 Altera Corporation
mult4.gdf
226 LEs, 113 MHz
204 LEs, 166 MHz
10K30E-1
155
© 2000 Altera Corporation
LPM_SHIFTREG: Parameterized Shift Register
LPM_AVALUE=
LPM_DIRECTION=
LPM_SVALUE=
LPM_WIDTH=
sclr
sset
shiftin
load
Data[ ]
clock
enable
shiftout
Q[ ]
aclr
aset
Parameters
Name | Required | Value | Description
LPM_AVALUE | No | Integer > 0 | Constant value that is loaded when aset is high
LPM_DIRECTION | No | "LEFT", "RIGHT" | Direction of the shift register. The default value is "LEFT"
LPM_SVALUE | No | Integer >=0 | Constant value that is loaded on the rising edge of clock when sset is high. The default value is all 1s.
LPM_WIDTH | Yes | Integer >= 0 | Width of the data[ ] and q[ ] ports.
156
© 2000 Altera Corporation
Using LPM_SHIFTREG for 3G blocks
LPM_SHIFTREG is needed for a number of different 3G blocks, e.g. PN Generators (several needed for despreading) and the Convolutional Encoder
Option for parallel loading of initial values without any resource penalty (required to load new values every 10ms)
Implementation flexibilities - can implement in LE or in ESBs
Can access any number of intermediate bits to implement characteristic polynomial of a
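A PN generator built from a shift register with XOR feedback on selected intermediate bits can be modelled as below. The 4-bit register and tap positions are a toy assumption chosen so the full period is easy to see; they are not the polynomials used by any 3G standard.

```python
def lfsr_sequence(width, taps, state, n_bits):
    """Fibonacci LFSR: XOR the tapped bits and shift the result in at the top.

    `taps` are 1-based bit positions (1 = the bit shifted out next).
    """
    out = []
    for _ in range(n_bits):
        out.append(state & 1)                  # serial PN output
        feedback = 0
        for t in taps:
            feedback ^= (state >> (t - 1)) & 1
        state = (state >> 1) | (feedback << (width - 1))
    return out

# Taps (1, 2) on a 4-bit register give a maximal-length sequence of period 15.
pn = lfsr_sequence(width=4, taps=(1, 2), state=0b0001, n_bits=30)
```

Reloading `state` corresponds to the parallel load of initial values mentioned above.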
157
© 2000 Altera Corporation
FIR Filter Implementation using LPMs
158
© 2000 Altera Corporation
FIR Characteristics
Output Is Determined Solely From Current and Previous Inputs
Y = C0X0 + C1X1 + C2X2 + …
CN = Coefficients
XN = Delayed Data
Result is an Inherently Stable Filter
– Filter Will Never Go Unstable and Oscillate
Constant Delay Through the System
More Computation Intensive Than if Output is Based on Previous Output (Recursive or IIR Filter)
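As a software reference for the equation Y = C0X0 + C1X1 + C2X2 + …, here is a minimal direct-form FIR sketch. The function name and list-based delay line are assumptions for illustration.

```python
def fir_filter(coeffs, samples):
    """Direct-form FIR: each output is the dot product of the coefficients
    with the current and previous inputs held in a delay line."""
    delay_line = [0] * len(coeffs)
    out = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]     # shift the new sample in
        out.append(sum(c * d for c, d in zip(coeffs, delay_line)))
    return out

impulse_response = fir_filter([1, 2, 3], [1, 0, 0, 0, 0])
```

The impulse response is just the coefficient list followed by zeros, which is why the filter cannot go unstable: there is no feedback path.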
© 2000 Altera Corporation
General FIR Filter Equation
A general 8-tap FIR filter has 8 coefficients.
– For a general 8-tap filter, the coefficients are:
• h(1), h(2), h(3), h(4), h(5), h(6), h(7), h(8)
– The equation for the output y(n) is
$$y(n) = x(n)h(1) + x(n-1)h(2) + x(n-2)h(3) + x(n-3)h(4) + x(n-4)h(5) + x(n-5)h(6) + x(n-6)h(7) + x(n-7)h(8)$$
• or in compact form:
$$y(n) = \sum_{i=0}^{N} h(i+1)\,x(n-i), \quad N = 7$$
© 2000 Altera Corporation
General FIR Filter Structure - 8 Taps
[Figure: Parallel FIR Shown. Input x(n+1) feeds a delay line of D-Q registers giving X(n) to X(n-7); taps s1 to s8 are multiplied by coefficients h1 to h8 and summed to give Yt. Bus widths marked: w, m, m+w, m+w+1, m+w+2, m+w+3]
© 2000 Altera Corporation
8-Tap Parallel FIR Filter: Non Symmetrical
[Figure: input x(n+1) feeds a delay line of D-Q registers giving X(n) to X(n-7); taps s1 to s8 are multiplied by coefficients h1 to h8 and summed to give Yt. Bus widths marked: w, m, m+w]
162
© 2000 Altera Corporation
FLEX / APEX FIR Implementation
FIR Equation:
– Y = C0X0 + C1X1 + C2X2 + …
FLEX / APEX Implementation
– Taps Delay Line
• Unit delay element implemented in LUT or EAB/ESB
– Partial Product (LUT)
• Multiplication by a constant implemented in LUT
– Adder Tree
• Partial Product Sum using a balanced Adder Tree
– Filter can be implemented using parallel techniques for speed or serial techniques for area
© 2000 Altera Corporation
FLEX / APEX - LUT Multiplication Benefits
LUT Operation: “Any Function of 4 Inputs Can Be Done”
Pre-computed Values Loaded in LUT
No Need for Full Multiplier
[Figure: logic element (LE). Inputs DATA1 to DATA4 feed the Look-up Table (LUT), which is loaded with pre-computed filter coefficients; Carry Chain (Carry In, Carry Out) and Cascade Chain (Cascade In, Cascade Out); register with PRN, CLRN, CLK; output LE Out]
© 2000 Altera Corporation
LUT-Based Vector Multiplier: Example
hn = 01 11 10 11
× sn = 11 00 10 01
p0 = 01 + 00 + 00 + 11 = 100
p1 = 01 + 00 + 10 + 00 = 011
yn = 011 + 000 + 100 + 011 = 1010
hn Values Are Constant
Partial Products Can Be Added Horizontally or Vertically
All hn Are Constant, so Only 16 Values Are Possible for Each pn
– p0 Is Determined by 4 sn LSBs (Bold), which Can Be Used as LUT Inputs
© 2000 Altera Corporation
Computing First Partial Product
LUT Contains the 16 Possible Values of p0
$$p_0 = (h_1 \times s_1^0) + (h_2 \times s_2^0) + (h_3 \times s_3^0) + (h_4 \times s_4^0) = (01 \times s_1^0) + (11 \times s_2^0) + (10 \times s_3^0) + (11 \times s_4^0)$$
Note: Superscripts Represent Bit Position
si0 => p0 -- LUT Value
0000 => 00+00+00+00 = 0000
0001 => 00+00+00+11 = 0011
0010 => 00+00+10+00 = 0010
0011 => 00+00+10+11 = 0101
0100 => 00+11+00+00 = 0011
0101 => 00+11+00+11 = 0110
0110 => 00+11+10+00 = 0101
0111 => 00+11+10+11 = 1000
1000 => 01+00+00+00 = 0001
1001 => 01+00+00+11 = 0100
1010 => 01+00+10+00 = 0011
1011 => 01+00+10+11 = 0110
1100 => 01+11+00+00 = 0100
1101 => 01+11+00+11 = 0111
1110 => 01+11+10+00 = 0110
1111 => 01+11+10+11 = 1001
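The LUT-based vector multiplier can be modelled in a few lines. This sketch assumes unsigned data and the address convention of the table (the most significant address bit belongs to s1); the function names are illustrative only.

```python
def build_lut(h):
    """Pre-compute the sum of the coefficients selected by each address."""
    n = len(h)
    lut = []
    for addr in range(1 << n):
        # The most significant address bit selects h[0] (h1 in the slides).
        lut.append(sum(h[i] for i in range(n) if (addr >> (n - 1 - i)) & 1))
    return lut

def vector_multiply(h, s, data_bits):
    """Compute sum(h[i] * s[i]) one bit plane at a time using only the LUT."""
    lut = build_lut(h)
    y = 0
    for bit in range(data_bits):
        addr = 0
        for value in s:                     # gather bit `bit` of every input
            addr = (addr << 1) | ((value >> bit) & 1)
        y += lut[addr] << bit               # shift left for magnitude
    return y

h = [0b01, 0b11, 0b10, 0b11]                # coefficients from the example
s = [0b11, 0b00, 0b10, 0b01]                # data from the example
y = vector_multiply(h, s, data_bits=2)
```

For the example values, bit plane 0 addresses entry 1001 (p0 = 100) and bit plane 1 addresses entry 1010 (p1 = 011), giving 4 + 2 × 3 = 10, that is 1010 in binary.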
© 2000 Altera Corporation
4 Taps = Fundamental Building Block
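[Figure: four-tap building block. The multipliers (h1 to h4) and adders on taps s1 to s4 of the D-Q delay line are replaced by a LOOK UP TABLE (LUT) that produces Yt. Bus widths marked: m+w, m+w+1, m+w+2]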
© 2000 Altera Corporation
4-Input, 2-Bit Parallel Vector Multiplier
LUT(0) & LUT(1) Are Identical
Each LUT = 4 Separate FLEX 16x1 LUTs
[Figure: 2-bit inputs S(1) to S(4). Bit 0 of each input drives the address inputs A1 to A4 of LUT(0) and bit 1 drives LUT(1). The 4-bit partial products P(0) and P(1) are combined as partial sums, with a shift left by 1 for magnitude, to give the 5-bit FIR filter response]
© 2000 Altera Corporation
4 Input, 8-Bit Parallel Vector Multiplier
[Figure: inputs s1 to s4 address eight 16x8 LUTs that produce partial products p0 to p7; an adder tree combines them into the output Yt. Bus widths marked: 8, 10, 12, 16]
© 2000 Altera Corporation
Increasing Performance
Adding Registers Introduces Pipelining
[Figure: inputs s1 to s4 address eight 16x8 LUTs that produce partial products p0 to p7; D-Q registers are inserted between the adder tree stages that combine them into the output Yt. Bus widths marked: 8, 10, 12, 16]
© 2000 Altera Corporation
Parallel FIR Filter: Non Symmetrical
[Figure: input x(n+1) feeds a delay line of D-Q registers giving X(n) to X(n-7); taps s1 to s8 are multiplied by coefficients h1 to h8 and summed to give Yt. Bus widths marked: w, m, m+w]
© 2000 Altera Corporation
Utilizing Symmetry in the Coefficients
Linear phase FIR filters have symmetric (or anti-symmetric) coefficients
– For a linear phase 8-tap filter the coefficients are symmetric: h(1) = h(8), h(2) = h(7), h(3) = h(6), h(4) = h(5)
– Therefore the equation becomes:
$$y(n) = x(n)h(1) + x(n-1)h(2) + x(n-2)h(3) + x(n-3)h(4) + x(n-4)h(4) + x(n-5)h(3) + x(n-6)h(2) + x(n-7)h(1)$$
– Which can be factored to be
$$y(n) = h(1)[x(n) + x(n-7)] + h(2)[x(n-1) + x(n-6)] + h(3)[x(n-2) + x(n-5)] + h(4)[x(n-3) + x(n-4)]$$
© 2000 Altera Corporation
Parallel FIR Filter: Symmetrical
[Figure: Even Symmetry Shown. The delay line from x(n+1) through X(n) to X(n-7) is folded so that each pair of symmetric taps is added to form s1 to s4, which feed a Vector Multiplier with coefficients h1 to h4 to produce Yt]
$$s_1 = x(n) + x(n-7)$$
$$s_2 = x(n-1) + x(n-6)$$
$$s_3 = x(n-2) + x(n-5)$$
$$s_4 = x(n-3) + x(n-4)$$
Symmetric taps are added before the multiplication
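A quick numerical check that adding symmetric taps before the multiplication gives the same output with half the multiplies. The function names and sample values are assumptions for illustration; `window` holds x(n) to x(n-7).

```python
def fir_direct(h, window):
    """Eight multiplies: one per tap."""
    return sum(c * x for c, x in zip(h, window))

def fir_symmetric(h_half, window):
    """Four multiplies: add each symmetric pair of taps first."""
    n = len(window)
    return sum(h_half[i] * (window[i] + window[n - 1 - i])
               for i in range(n // 2))

h_half = [1, 2, 3, 4]                      # h(1) to h(4)
h_full = h_half + h_half[::-1]             # h(5) to h(8) mirror them
window = [5, -1, 3, 7, 2, 0, -4, 6]        # x(n), x(n-1), ..., x(n-7)

y_direct = fir_direct(h_full, window)
y_folded = fir_symmetric(h_half, window)
```

The pre-adders grow the data width by one bit, which is the price of halving the number of multipliers (or LUT inputs).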
© 2000 Altera Corporation
Serial FIR Filter
[Figure: serial FIR filter. Input x(n) loads a Parallel-to-Serial Shift Register (plsr_load) followed by a chain of n x 1 Shift Registers; Serial Adders (SUM, CIN/COUT, carry register with CLR) feed a 16 x 8 LUT, whose 8-bit output drives a Scaling Accumulator and Result Register producing y(n). A Control Block generates accum_clr, add_sub, latch_result, carry_clear and plsr_load]
clk = (n+1) * fSAMPLE
174
© 2000 Altera Corporation
Ways To Reduce The Computation
Symmetrical Coefficients
– Can Add Or Subtract Tap Values Prior To Multiplication
– Conceptually Saves Number Of Multipliers
Coefficient = 0
– Altera will eliminate
Coefficient Values Near Zero
– Coefficients become zero or only a few bits are needed to compute
175
© 2000 Altera Corporation
Altera Products Update
176
© 2000 Altera Corporation
Component Update
[Figure: device families plotted by I/O against Usable Gates: MAX 3000, MAX 7000, MAX 9000, FLEX 8000, FLEX 6000, FLEX 10K, ACEX 1K, ACEX 2K, APEX 20K]
177
© 2000 Altera Corporation
APEX Architecture
MultiCore™ Architecture
– LUT
– Product Term
– Memory
Embedded System Block (ESB)
– Product Term
– Dual-Port RAM
– ROM
– CAM
I/O Features
– LVDS
– MultiVolt I/O
– GTL+
– CTT
– AGP
– SSTL-3/-2
– HSTL
Clock Management
– Up to 4 PLLs
– ClockShift™ Circuitry
– ClockBoost™ Circuitry
– ClockLock™ Circuitry
178
© 2000 Altera Corporation
APEX 20KE Features
1.8-V, 0.18/0.15-µm, 6LM SRAM Process
100K to 1.5M Gates
– 2,560 to 54,720 Logic Elements
– 33,000 to 467,000 Bits of RAM
– 256 to 3,648 Macrocells
MultiCore™ Embedded Architecture
– Product Term with 3.9-ns Performance
– High-Speed Dual-Port RAM
– Content Addressable Memory (CAM)
Advanced I/O Standard Support
– LVTTL, LVCMOS, SSTL3, SSTL2, GTL+, LVDS, AGP, CTT, HSTL
Advanced PLL Features
– Up to 4 PLLs
– m/(n × k) frequency synthesis
– Programmable delay adjustment
– Programmable phase shift
MultiVolt™ I/O Interface
Advanced FineLine BGA™ Packaging
179
© 2000 Altera Corporation
APEX 20KE Availability
Devices
EP20K30E
EP20K60E
EP20K100E
EP20K160E
EP20K200E
EP20K300E
EP20K400E
EP20K600E
EP20K1000E
EP20K1500E
Now
Now
Now
Now
Now
Now
Now
Now
Availability
180
© 2000 Altera Corporation
ACEX: High Performance at Low Cost
ACEX 1K
– 0.22-/0.18-micron Hybrid Process
– 2.5-V Core and 5-V Tolerant I/Os
– Up to 4,992 Logic Elements (up to 100K Gates)
– Up to 49Kb of Dual-Port RAM
– 64-Bit, 66-MHz PCI Compliant
– PLL Support
ACEX 2K
– 0.18-micron 6LM Process
– 1.8-V Core
– Up to 150K Usable Gates
– Dual-Port RAM Blocks
– High-Performance PLL
– Advanced I/O Standard
181
© 2000 Altera Corporation
ACEX 1K Availability
Device Availability
EP1K100 Now
EP1K50
EP1K30
EP1K10
Now
Now
182
© 2000 Altera Corporation
183
© 2000 Altera Corporation
MAX 7000B Overview
2.5-V In-System Programmability (ISP)
3.5-ns tPD Performance
Support for Advanced I/O Standards
– GTL+, SSTL-2, SSTL-3
Bus Hold
User Mode I/O Pull-ups
0.22-Micron, Four-Layer-Metal Process Technology
Function- and Pin-Compatible with Industry-Standard MAX 7000A and MAX 7000S Devices
Ultra FineLine BGA and FineLine BGA Packages
184
© 2000 Altera Corporation
MAX 7000B Device Features
Feature | EPM7032B | EPM7064B | EPM7128B | EPM7256B | EPM7512B
Usable Gates | 600 | 1,250 | 2,500 | 5,000 | 10,000
Macrocells | 32 | 64 | 128 | 256 | 512
Max. User I/O Pins | 36 | 68 | 100 | 164 | 212
tPD (ns) | 3.5 | 3.5 | 4.5 | 5.0 | 6.0
fCNT (MHz) | 200 | 200 | 192 | 178.6 | 147.1
185
© 2000 Altera Corporation
MAX 7000B Availability
Industry Standard 2.5-V Product Term Family
Device Availability
EPM7032B
EPM7064B
EPM7128B Now
EPM7256B
EPM7512B
Now
Now
186
© 2000 Altera Corporation
Intellectual Property Core Products Update
www.altera.com/IPmegastore
187
© 2000 Altera Corporation
Bus Interface IP Available
PCI
PCI-X
Utopia II
Utopia III
POS-PHY II
CAN Bus
IIC
IEEE 1394
USB
IEEE 1284 Parallel
188
© 2000 Altera Corporation
Communications IP Available
CRC
Reed-Solomon
De/Interleaver
Viterbi
Turbo
u-Law & A-Law
ADPCM Codec
10/100 Ethernet MAC
Gigabit Ethernet MAC
ATM Cell Delineation
ATM Over SONET
SONET Processor
Packet Over SONET
HDLC Controller
Tone Generation
Inverse Multiplexing for ATM
Tx/Ex Framers
8b/10b Codec
189
© 2000 Altera Corporation
Signal-Processing IP Available
Color Space Converter FFT Compiler FIR Filter Compiler Equalizers ARCTAN CORDIC NCO Convolutional Encoder
DES Encryption DCT IIR Filters Laplacian Edge Detector Rank Order Filter Binary Pattern Correlator Digital Modulator Echo Canceller
190
© 2000 Altera Corporation
Processor IP Available
Lexra 32-Bit, R3000-Class Processor
Tensilica Xtensa Processor
ARC Configurable Processor
8051/8052
6502
Z80
29116A
191
© 2000 Altera Corporation
Peripheral IP Available
UARTs
– 16450, 16550, 6402, 6850, 8251, 8255
SDRAM Controllers
DDR DRAM Controllers
DMA Controllers
– 8237, Programmable
8255 Peripheral Adapter
PowerPC Bus Interface
8259 Interrupt Controller
Programmable Memory Controller
192
© 2000 Altera Corporation
Development Tools Update
193
© 2000 Altera Corporation
Development Tools Update
MAX+PLUS II Version 9.6 Released
– Significantly improved Push Button Results (Fmax) and Compile times from version 9.5 and up
– Includes ACEX 1K Support
– OEM Synthesis Tools - World Class Synthesis
Quartus 2000.03 Released
– Supports all new APEX E devices
– OEM Synthesis Tools - World Class Synthesis
– 40% fMAX Improvement
– Significantly Better I/O Timing
194
© 2000 Altera Corporation
OEM Partnership
All Included in Altera Subscription Products at NO Additional Cost
Synthesis Tools | Importance
Exemplar LeonardoSpectrum | World-Class Synthesis, UNIX & PC Platform Support
Synopsys FPGA Express | World-Class Synthesis, PC Platform Support
Simulation Tool | Importance
Model Technology ModelSim | World-Class Behavioral Simulation, Test Bench Capability, UNIX & PC Platform Support
195
© 2000 Altera Corporation
Entry-Level Tools: Free Web Download
Free WEB Download !!!
Feature | MAX+PLUS II BASELINE | MAX+PLUS II E+MAX
Device Support | FLEX 6000, ACEX | MAX 7000, MAX 3000
Synthesis | AHDL, LeonardoSpectrum, FPGA Express | AHDL, LeonardoSpectrum, FPGA Express
Schematic Entry | YES | YES
Timing Simulation | YES - Native | YES - Native
Floorplan Editor | YES | YES
LPM & OpenCore | YES | YES
196
© 2000 Altera Corporation
Real-Time Hardware Verification Method
197
© 2000 Altera Corporation
Verification Options Are Restricted
Many Complex Designs Cannot Be Simulated with True System Stimulus– Communications– Multimedia– DSP
Board Verification Problematic with New BGA Packaging
198
© 2000 Altera Corporation
SignalTap Logic Analysis Solution
Built-in Logic Analyzer
– User Defines Signals & Trigger Points to Capture Data
– SignalTap Megafunction Inserted into Design
– Data Stored in APEX EABs
– Streamed out to Quartus through Download Cable
First and Only Tool of Its Kind!
[Figure: APEX 20K device connected through the Download Cable]
199
© 2000 Altera Corporation
APEX 20K
Analyze Board Signals as Well asInternal Signals
MasterBlaster™Plus
SignalTap Plus Capability
200
© 2000 Altera Corporation
Signal Tap Plus System Analyzer
System-Centric Debug Solution
[Figure: SignalTap Plus System Analyzer connected over JTAG, capturing Internal Chip-Level Activity and External Board-Level Activity]
201
© 2000 Altera Corporation
Development Boards
202
© 2000 Altera Corporation
FLEX PCI Development Kit
64-Bit/66-MHz PCI
– EPF10K100E-1 FBGA484
Works Out of Box
– Reference Design
– Windows Driver
– Software Program
Expansion Cards
Development & Prototyping Platform
Available Now
203
© 2000 Altera Corporation
SOPC Development Board
Desk-Top Board
– EP20K400EBC652
Features
– DRAM, Flash Memory, Ethernet, USB, Firewire, LCD, Keyboard, Serial …
Same Expansion Cards as PCI
Available Now
204
© 2000 Altera Corporation
Summary