![Page 1: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/1.jpg)
Techniques for Low Power Turbo Coding in Software Radio
Joe Antoon, Adam Barnett
![Page 2: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/2.jpg)
Software Defined Radio
• Single transmitter for many protocols
• Protocols completely specified in memory
• Implementation:
  – Microprocessors
  – Field programmable logic
![Page 3: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/3.jpg)
Why Use Software Radio?
• Wireless protocols are constantly reinvented
  – 5 Wi-Fi protocols
  – 7 Bluetooth protocols
  – Proprietary mice and keyboard protocols
  – Mobile phone protocol alphabet soup
• Custom DSP logic for each protocol is costly
![Page 4: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/4.jpg)
So Why Not Use Software Radio?
• Requires high performance processors
• Consumes more power
[Figure: three forks compared: an inefficient general fork, an efficient application-specific fork, and an inefficient field-programmable fork]
![Page 5: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/5.jpg)
Turbo Coding
• Channel coding technique
• Throughput nears theoretical limit
• Great for bandwidth limited applications
  – CDMA2000
  – WiMAX
  – NASA's MESSENGER probe
![Page 6: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/6.jpg)
Turbo Coding Considerations
• Presents a design trade-off
• Turbo coding is computationally expensive
• But it reduces cost in other areas
  – Bandwidth
  – Transmission power
![Page 7: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/7.jpg)
Reducing Power in Turbo Decoders
• FPGA turbo decoders
  – Use dynamic reconfiguration
• General processor turbo decoders
  – Use a logarithmic number system
![Page 8: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/8.jpg)
Generic Turbo Encoder
[Figure: block diagram with a data stream input, two component encoders, an interleaver, and outputs s, p1, and p2]
![Page 9: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/9.jpg)
Generic Turbo Decoder
[Figure: block diagram with inputs q1, r, and q2, two decoders, and an interleaver]
![Page 10: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/10.jpg)
Decoder Design Options
• Multiple algorithms used to decode
• Maximum A-Posteriori (MAP)
  – Most accurate estimate possible
  – Complex computations required
• Soft-Output Viterbi Algorithm (SOVA)
  – Less accurate
  – Simpler calculations
[Figure: decoder block]
![Page 11: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/11.jpg)
FPGA Design Options
• Goal: Make an adaptive decoder
[Figure: a decoder takes received data and parity and outputs the original sequence; a tunable parameter ranges from low power and accuracy to high power and accuracy]
![Page 12: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/12.jpg)
Component Encoder
• M blocks are 1-bit registers
• Memory provides encoder state
[Figure: two M registers feeding a generator function]
![Page 13: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/13.jpg)
Encoder State
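[Figure: trellis of the four encoder states (00, 01, 10, 11) over time, with transitions labeled by bit values 0 and 1 and a generator function (GF) block]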
![Page 14: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/14.jpg)
Viterbi’s Algorithm
• Determine most likely output
• Simulate encoder state given received values
[Figure: timeline of states s0, s1, s2, …, received values (r0, p0), (r1, p1), (r2, p2), …, and decoded bits d0, d1, d2, …]
![Page 15: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/15.jpg)
Viterbi’s Algorithm
• Write: Compute branch metric (likelihood)
• Traceback: Compute path metric, output data
• Update: Compute distance between paths
• Rank paths by path metric and choose best
• For N memory bits:
  – Must calculate 2^(N-1) paths for each state
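The branch-metric, path-metric, and select steps above can be sketched in Python as a hard-decision Viterbi decoder over a four-state trellis. This is a minimal sketch, not the decoder from the slides: the generator polynomials (octal 7 and 5), the non-recursive encoder, and the Hamming-distance branch metric are all assumptions, since the slides do not specify the generator function, and the soft outputs that SOVA adds are omitted.

```python
from itertools import product

# Assumed generators (octal 7 and 5) for a 2-register encoder.
# The slides do not specify the generator function.
GENERATORS = (0b111, 0b101)
INF = float("inf")

def step(state, bit):
    """Return (next_state, output_bits) for one input bit."""
    reg = (bit << 2) | state  # input bit followed by the two stored bits
    out = tuple(bin(reg & g).count("1") % 2 for g in GENERATORS)
    return reg >> 1, out

def encode(bits):
    """Encode a bit list at rate 1/2, starting from state 00."""
    state, coded = 0, []
    for b in bits:
        state, out = step(state, b)
        coded.extend(out)
    return coded

def viterbi(received):
    """Keep the lowest-distance path into each of the 4 states."""
    metric = {0: 0, 1: INF, 2: INF, 3: INF}  # encoder starts in state 00
    paths = {s: [] for s in metric}
    for i in range(0, len(received), 2):
        pair = received[i:i + 2]
        new_metric = {s: INF for s in metric}
        new_paths = {s: [] for s in metric}
        for s, bit in product(metric, (0, 1)):
            if metric[s] == INF:
                continue
            nxt, out = step(s, bit)
            # Branch metric: Hamming distance to the received pair.
            branch = sum(a != b for a, b in zip(out, pair))
            # Add, compare, select: keep the better path into nxt.
            if metric[s] + branch < new_metric[nxt]:
                new_metric[nxt] = metric[s] + branch
                new_paths[nxt] = paths[s] + [bit]
        metric, paths = new_metric, new_paths
    best = min(metric, key=metric.get)
    return paths[best]
```

For example, `viterbi(encode([1, 0, 1, 1]))` recovers the original bits, and the decoder can also recover them when one early coded bit is flipped.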
![Page 16: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/16.jpg)
Adaptive SOVA
• SOVA: Inflexible path system scales poorly
• Adaptive SOVA: Heuristic
  – Limit to M paths max
  – Discard if path metric below threshold T
  – Discard all but top M paths when too many paths
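The pruning heuristic above might look like the following Python sketch. The function name `prune_paths`, the `(metric, bits)` tuple layout, and the convention that a higher metric means a more likely path are assumptions made for illustration; they are not taken from the slides.

```python
def prune_paths(paths, max_paths, threshold):
    """Apply the two Adaptive SOVA discard rules to a list of paths.

    paths: list of (metric, bits) tuples; a higher metric is assumed
    to mean a more likely path.
    """
    # Rule 1: discard any path whose metric is below threshold T.
    kept = [p for p in paths if p[0] >= threshold]
    # Rule 2: if too many paths remain, keep only the top M.
    if len(kept) > max_paths:
        kept = sorted(kept, key=lambda p: p[0], reverse=True)[:max_paths]
    return kept
```

Lowering `max_paths` or raising `threshold` reduces the work per step at the cost of accuracy, which is the tunable trade-off the adaptive decoder relies on.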
![Page 17: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/17.jpg)
Implementing in Hardware
[Figure: block diagram with inputs q and r and the blocks Branch Metric Unit, Add-Compare-Select, Survivor Memory, and Control]
![Page 18: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/18.jpg)
Implementing in Hardware
• Controller
  – Control memory
  – Select paths
• Branch Metric Unit
  – Compute likelihood
  – Consider all possible “next” states
• Add, Compare, Select
  – Append path metric
  – Discard paths
• Survivor Memory
  – Store / discard path bits
![Page 19: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/19.jpg)
Implementing in Hardware
Add, Compare, Select Unit
[Figure: present-state path values and branch values enter a compute/compare-paths block that produces next-state path values; the path distance is compared against a threshold T]
![Page 20: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/20.jpg)
Dynamic Reconfiguration
• Bit Error Rate (BER)
  – Changes with signal strength
  – Changes with number of paths used
• Change hardware at runtime
  – Weak signal: use many paths, save accuracy
  – Strong signal: use few paths, save power
  – Sample SNR every 250k bits, reconfigure
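The runtime policy above could be sketched in Python as a lookup from sampled SNR to a path limit. Only the 250k-bit sampling interval comes from the slide; the SNR thresholds, the path counts in `PATH_CONFIGS`, and both function names are hypothetical values chosen for illustration.

```python
SAMPLE_INTERVAL_BITS = 250_000  # sampling period stated on the slide

# Hypothetical (minimum SNR in dB, max paths) pairs, strongest signal first.
PATH_CONFIGS = [(6.0, 2), (3.0, 4), (float("-inf"), 8)]

def choose_max_paths(snr_db):
    """Weak signal: many paths (accuracy). Strong signal: few paths (power)."""
    for min_snr_db, max_paths in PATH_CONFIGS:
        if snr_db >= min_snr_db:
            return max_paths

def reconfiguration_schedule(snr_samples_db):
    """Return (bit_offset, max_paths) for each sample that changes the setting."""
    schedule, current = [], None
    for i, snr_db in enumerate(snr_samples_db):
        max_paths = choose_max_paths(snr_db)
        if max_paths != current:
            schedule.append((i * SAMPLE_INTERVAL_BITS, max_paths))
            current = max_paths
    return schedule
```

The schedule only records changes, reflecting that the hardware needs to be reconfigured only when the sampled SNR moves the decoder to a different setting.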
![Page 21: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/21.jpg)
Dynamic Reconfiguration
![Page 22: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/22.jpg)
Experimental Results
K (number of encoder bits) is proportional to average speed and power
![Page 23: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/23.jpg)
Experimental Results
• FPGA decoding has a much higher throughput
• Due to parallelism
![Page 24: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/24.jpg)
Experimental Results
• ASOVA performs worse than commercial cores
• However, in other metrics it is much better
  – Power
  – Memory usage
  – Complexity
![Page 25: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/25.jpg)
Future Work
• Use present reconfiguration means to design
  – Partial reconfiguration
  – Dynamic voltage scaling
• Compare to power efficient software methods
![Page 26: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/26.jpg)
Power-Efficient Implementation of a Turbo Decoder in SDR System
• Turbo coding systems are created by using one of three general processor types
  – Fixed Point (FXP)
    • Cheapest, simplest to implement, fastest
  – Floating Point (FLP)
    • More precision than fixed point
  – Logarithmic Numbering System (LNS)
    • Simplifies complex operations
    • Complicates simple add/subtract operations
![Page 27: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/27.jpg)
Logarithmic Numbering System
• $X = \{s,\; x = \log_b |X|\}$
  – s = sign bit, remaining bits used for number value
• Example
  – Let b = 2
  – Then the decimal number 8 would be represented as $\log_2 8 = 3$
  – Numbers are stored in computer memory in 2’s complement form (3 = 01111101) (sign bit = 0)
![Page 28: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/28.jpg)
Why use Logarithmic System?
• Greatly simplifies multiplication, division, roots, and exponents
  – Multiplication simplifies to addition
    • E.g. 8 * 4 = 32, LNS => 3 + 2 = 5 (2^5 = 32)
  – Division simplifies to subtraction
    • E.g. 8 / 4 = 2, LNS => 3 – 2 = 1 (2^1 = 2)
![Page 29: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/29.jpg)
Why use Logarithmic System?
• Square roots are done as right shifts by one bit
  – E.g. sqrt(16) = 4, LNS => 4 shifted right = 2 (2^2 = 4)
• Squares are done as left shifts by one bit
  – E.g. 8^2 = 64, LNS => 3 shifted left = 6 (2^6 = 64)
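The four examples on these slides can be reproduced with a small Python sketch that stores the base-2 logarithm as a fixed-point integer. The 8 fractional bits (`FRAC_BITS`) and all function names are assumptions for illustration; the sign bit is omitted, so only positive values are handled.

```python
import math

FRAC_BITS = 8  # assumed fixed-point precision of the stored logarithm

def to_lns(value):
    """Store log2(value) as a fixed-point integer (positive values only)."""
    return round(math.log2(value) * (1 << FRAC_BITS))

def from_lns(x):
    """Convert a stored logarithm back to an ordinary number."""
    return 2.0 ** (x / (1 << FRAC_BITS))

def lns_mul(x, y):
    return x + y   # multiplication becomes addition

def lns_div(x, y):
    return x - y   # division becomes subtraction

def lns_sqrt(x):
    return x >> 1  # square root becomes a right shift by one bit

def lns_square(x):
    return x << 1  # squaring becomes a left shift by one bit
```

For instance, `from_lns(lns_mul(to_lns(8), to_lns(4)))` evaluates 8 * 4 using a single integer addition on the stored logarithms.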
![Page 30: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/30.jpg)
So why not use LNS for all processors?
• Unfortunately, addition and subtraction are greatly complicated in LNS.
  – Addition: $\log_b(|X| + |Y|) = x + \log_b(1 + b^z)$
  – Subtraction: $\log_b(|X| - |Y|) = x + \log_b(1 - b^z)$
    • where $x = \log_b |X|$, $y = \log_b |Y|$, and $z = y - x$
• Turbo coding/decoding is computationally intense, requiring more multiplications, divisions, roots, and exponentiations than additions or subtractions
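The addition and subtraction identities follow from factoring out the first operand: b^x + b^y = b^x (1 + b^(y - x)). A minimal Python sketch, assuming floating-point logarithms for readability (a hardware LNS unit would typically read the correction term from a table), with illustrative function names:

```python
import math

def lns_add(x, y, base=2.0):
    """Return log_b(b**x + b**y) from the stored logarithms x and y."""
    z = y - x
    return x + math.log(1 + base ** z, base)

def lns_sub(x, y, base=2.0):
    """Return log_b(b**x - b**y); requires y < x so the result is positive."""
    z = y - x
    return x + math.log(1 - base ** z, base)
```

With b = 2, `lns_add(3, 2)` gives log2(8 + 4) = log2(12), and `lns_sub(3, 2)` gives log2(8 - 4) = 2. Each call needs an exponentiation and a logarithm, which is why these operations are more costly in LNS than in fixed or floating point.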
![Page 31: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/31.jpg)
Turbo Decoder block diagram
• Each bit decision requires a subtraction, a table lookup, and an addition
![Page 32: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/32.jpg)
Proposed new block diagram
• As the difference between e^a and e^b becomes larger, the error between the value stored in the lookup table and the computed value becomes negligible.
• For this simulation, a difference threshold of d > 5 was used
![Page 33: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/33.jpg)
How it works
• For d > 5:
  – The new mux (on right) ignores the SRAM input and simply adds 0 to the MAX result.
  – The pre-decoder circuitry disables the SRAM for power conservation.
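The gating described above can be sketched in Python using the identity Max*(a, b) = ln(e^a + e^b) = max(a, b) + ln(1 + e^(-|a - b|)). In this sketch the correction term is computed directly as a stand-in for the SRAM lookup table, d is taken to be |a - b|, and the function names are illustrative; only the threshold of 5 comes from the slides.

```python
import math

THRESHOLD = 5.0  # the slides use a difference of > 5

def max_star(a, b):
    """Exact Max*: max(a, b) plus the correction term ln(1 + e^-|a-b|)."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_star_gated(a, b, threshold=THRESHOLD):
    """Skip the correction when the operands differ by more than threshold."""
    d = abs(a - b)
    if d > threshold:
        # Mux selects 0 and the table (SRAM) is not read.
        return max(a, b)
    # Stands in for the lookup-table read in hardware.
    return max(a, b) + math.log1p(math.exp(-d))
```

Under these assumptions the skipped correction is always smaller than ln(1 + e^-5), roughly 0.0067, which bounds the error of the gated version.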
![Page 34: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/34.jpg)
Comparing the 3 simulations
• Comparisons were done between a 16-bit fixed point microcontroller, a 16-bit floating point processor, and a 20-bit LNS processor.
• 11 bits would be sufficient for FXP and FLP, but 16-bit processors are much more common
• Similarly, 17 bits would suffice for the LNS processor, but 20-bit is a common type
![Page 35: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/35.jpg)
Power Consumption
![Page 36: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/36.jpg)
Latency
• Recall: Max*(a,b) = ln(e^a+e^b)
![Page 37: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/37.jpg)
Power savings
• Pre-Decoder circuitry adds 11.4% power consumption compared to SRAM read.
• So when an SRAM read is required, we use 111.4% of the power compared to the unmodified system
• However, when SRAM is blocked we only use 11.4% of the power we used before.
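The two cases above combine into a simple average. The Python sketch below normalizes power to one SRAM read in the unmodified system and treats `read_fraction`, the share of Max* operations that still need the table, as a free parameter; that parameter and the function name are assumptions, and other sources of power are ignored.

```python
PRE_DECODER_OVERHEAD = 0.114  # 11.4% of an SRAM read, from the slide

def relative_max_star_power(read_fraction):
    """Average Max* power relative to the unmodified system (1.0 = no change).

    read_fraction: share of Max* operations that still read the SRAM.
    """
    with_read = 1.0 + PRE_DECODER_OVERHEAD  # SRAM read plus pre-decoder
    without_read = PRE_DECODER_OVERHEAD     # pre-decoder only
    return read_fraction * with_read + (1.0 - read_fraction) * without_read
```

In this model the modification only saves power when fewer than 88.6% of Max* operations need the SRAM read, since the pre-decoder overhead is paid on every operation.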
![Page 38: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/38.jpg)
Power savings
• The CACTI simulations for the system reported that the Max* operation accounted for 40% of all operations in the decoder
• The Max* operations for the modified system required 69% of the power when compared to the unmodified system.
• This leads to an overall power savings of 69% * 40% = 27.6%
![Page 39: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/39.jpg)
Conclusion
• Turbo codes are computationally intense, requiring more complex operations than simple ones
• LNS processors simplify complex operations at the expense of making adding and subtracting more difficult
![Page 40: Techniques for Low Power Turbo Coding in Software Radio Joe Antoon Adam Barnett](https://reader030.vdocuments.net/reader030/viewer/2022032606/56649e935503460f94b98d3c/html5/thumbnails/40.jpg)
Conclusion
• Using an LNS processor with slight modifications can reduce power consumption by 27.6%
• Overall latency is also reduced due to ease of complex operations in LNS processor when compared to FXP or FLP processors.