GPU-accelerated SDR Implementation of Multi-User Detector for Satellite Return Links
> GTC 2014 > Chen Tang > 03.2014 DLR.de • Chart 1
Chen Tang [email protected]
Institute of Communication and Navigation
German Aerospace Center
Preamble
• German Aerospace Center (DLR)
  • National aeronautics and space research center of Germany
  • Wide range of R&D projects in national and international partnerships
    • DLR & NASA operate the flying infrared telescope SOFIA
    • DLR operates and coordinates Columbus (the European lab module on the ISS)
    • Galileo satellite navigation system
• The work presented here was developed within the NEXT (Network Coding Satellite Experiment) project, funded by the German Space Agency, which paved the way for the GEO research communication satellite H2Sat (2017)
• H2Sat: explore and test new broadband (high data rate) satellite communication
Overview
• What Problems?
  • Introduction and Motivation
• How to Solve?
  • Multi-User Detection (MUD) System Design
  • GPU-accelerated SDR Implementation of MUD
• Result and Outlook
Introduction and Motivation
• Unidirectional satellite broadcast service
• Bidirectional satellite communication
  • Forward link
  • Return link
  • e.g. internet over satellite; interactive satellite TV services
• Multi-user access issue
• Multi-access schemes: Time Division Multiple Access (TDMA)
[Figure: TDMA resource allocation; axes: frequency f, time t]
• Multi-access schemes: Frequency Division Multiple Access (FDMA)
[Figure: FDMA resource allocation; axes: frequency f, time t]
• Scarcity and high cost of satellite frequency spectrum (millions of dollars)
  • How to improve spectrum efficiency?
  • Multi-User Detection (MUD)
[Figure: MF-TDMA (e.g. DVB-RCS) resource allocation; axes: frequency f, time t]
Multi-User Detection (MUD) System
• Multiple users transmit at the same frequency and time
• A transparent satellite return link
• Main objectives:
  • Develop a MUD receiver
  • Increase decoding throughput → real-time processing
• Multi-User Detection (MUD)
  • Increase spectrum efficiency
  • Few practical MUD implementations for satellite systems
    • High complexity
    • Sensitive to synchronization and channel estimation errors
MUD System Design
• Successive Interference Cancellation (SIC)
  • Sequentially decode users & cancel interference
  • Linear complexity in the number of users
  • Straightforward extension to support more users
[Figure: power p over frequency f; user 1 and user 2 occupy the same band, so user 2 is transmitted for "free"]
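The SIC idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that the slides do not make: uncoded BPSK, perfectly known amplitudes, symbol-synchronous users, and hard decisions in place of LDPC decoding and channel estimation. The function name `sic_detect` and all numeric values are hypothetical.

```python
import random

def sic_detect(rx, amplitudes):
    """Detect superimposed BPSK users one after another, strongest first.

    amplitudes must be sorted from strongest to weakest user (assumed known here).
    """
    residual = list(rx)
    decisions = []
    for amp in amplitudes:
        # Detect the current user, treating all weaker users as noise.
        bits = [1 if y < 0 else 0 for y in residual]
        decisions.append(bits)
        # Rebuild this user's signal (bit 0 -> +amp, bit 1 -> -amp) and cancel it.
        residual = [y - amp * (1 - 2 * b) for y, b in zip(residual, bits)]
    return decisions

random.seed(0)
n = 1000
tx1 = [random.randint(0, 1) for _ in range(n)]
tx2 = [random.randint(0, 1) for _ in range(n)]
amplitudes = [1.0, 0.4]  # hypothetical received amplitudes, user 1 stronger
rx = [amplitudes[0] * (1 - 2 * a) + amplitudes[1] * (1 - 2 * b) + random.gauss(0, 0.05)
      for a, b in zip(tx1, tx2)]

bits1, bits2 = sic_detect(rx, amplitudes)
print("user 1 errors:", sum(a != b for a, b in zip(tx1, bits1)))
print("user 2 errors:", sum(a != b for a, b in zip(tx2, bits2)))
```

The loop runs once per user, which is where the linear complexity in the number of users comes from; supporting a third user only means appending another amplitude to the list.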
MUD System Design
• SDR = Software Defined Radio
  • Components (e.g. filter, amplifier, modulator etc.) in a communication system are implemented in software
• Benefits vs. hardware-based devices:
  • Flexible to change
  • Lower cost
  • Shorter development time
• Drawback vs. hardware-based devices:
  • Low processing power
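As a small illustration of what "implemented in software" means, the following sketch realizes a filter as plain code instead of a hardware block. It is a generic direct-form FIR filter, not taken from the presented receiver; the function name `fir_filter` and the tap values are hypothetical.

```python
def fir_filter(samples, taps):
    """Direct-form FIR filter: y[n] = sum_k taps[k] * x[n - k], with x[n] = 0 for n < 0."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out

taps = [0.25, 0.25, 0.25, 0.25]  # hypothetical 4-tap moving-average low-pass
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
smoothed = fir_filter(alternating, taps)
print(smoothed)
```

Changing the filter is a matter of editing the `taps` list, which reflects the flexibility benefit; the nested loop over every sample and every tap hints at the processing-power drawback.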
SDR
• Programmable radio devices
  • DSP (Digital Signal Processor)
  • FPGA (Field Programmable Gate Array)
  • SoC (Programmable System on Chip)
  • GPGPU (General-Purpose GPU)
GPU-based SDR
Nvidia Tesla C2070: 448 cores; 515 GFLOPS of double-precision peak performance
• Restriction of FPGA-based SDR
  • Long development time and complexity
  • No standardized protocols, interfaces or architectures → less portable
• Nvidia CUDA GPU-based SDR
  • High performance
  • Less effort to develop
  • Unified architecture → more portable
[Figure: GPU vs. FPGA comparison. GPU: Nvidia GTX285; HC1: 5 x Virtex-5 FPGA]
Ref: GPU vs FPGA for high productivity computing, 2010 (David H. Jones, A. Powell, C. Bouganis, Peter Y.K. Cheung)
MUD System Design
• Real-time implementation of MUD is challenging
  • T_dec ≤ T_frame
• Processing bottlenecks:
  • LDPC channel decoding
  • EM channel estimation
  • Resampling and interference cancellation
[Figure: LDPC Tanner graph with check nodes C_1 ... C_(n-k) and variable nodes V_1 ... V_n exchanging messages C_j → V_i and V_i → C_j. U1: n = 4800, k = 3200; U2: n = 4800, k = 2400]
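The C_j → V_i and V_i → C_j messages in the figure correspond to iterative message passing on the Tanner graph. The sketch below shows that exchange with the min-sum approximation on a toy (7,4) Hamming parity-check matrix. The slides do not state which decoding rule or schedule the receiver uses, so min-sum, the toy matrix and the name `min_sum_decode` are assumptions for illustration only; the real codes have n = 4800.

```python
def min_sum_decode(H, llr, max_iter=20):
    """Min-sum message passing. llr[i] > 0 means bit i is more likely 0."""
    m, n = len(H), len(H[0])
    # Variable nodes attached to each check node (the edges of the Tanner graph).
    checks = [[i for i in range(n) if H[j][i]] for j in range(m)]
    # V_i -> C_j messages start as the channel LLRs.
    v2c = {(j, i): llr[i] for j in range(m) for i in checks[j]}
    c2v = {}
    hard = [1 if x < 0 else 0 for x in llr]
    for _ in range(max_iter):
        # C_j -> V_i: sign product and minimum magnitude over the other edges.
        for j in range(m):
            for i in checks[j]:
                others = [v2c[(j, k)] for k in checks[j] if k != i]
                sign = -1.0 if sum(x < 0 for x in others) % 2 else 1.0
                c2v[(j, i)] = sign * min(abs(x) for x in others)
        # V_i -> C_j: channel LLR plus all incoming messages except the one from C_j.
        total = list(llr)
        for (j, i), msg in c2v.items():
            total[i] += msg
        for (j, i) in v2c:
            v2c[(j, i)] = total[i] - c2v[(j, i)]
        hard = [1 if x < 0 else 0 for x in total]
        # Stop as soon as every parity check is satisfied.
        if all(sum(hard[i] for i in checks[j]) % 2 == 0 for j in range(m)):
            break
    return hard

H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
# All-zero codeword sent; the third bit arrives with the wrong sign.
received_llr = [2.0, 2.0, -1.0, 2.0, 2.0, 2.0, 2.0]
decoded = min_sum_decode(H, received_llr)
print(decoded)
```

Within one iteration, every check-node update and every variable-node update is independent of the others of its kind, which is the property that makes this step a natural candidate for GPU parallelization.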
GPU-based MUD
| Processing bottleneck | To be accelerated by GPU |
| --- | --- |
| LDPC channel decoding | 4800 nodes to be processed iteratively |
| EM channel estimation | FFTs of thousands of points, run iteratively |
| Interference cancellation | Resampling, FFTs of thousands of points |
MUD receiver on GPU
• Processing bottlenecks:
  • LDPC channel decoding
  • EM channel estimation
  • Resampling and interference cancellation
  • Data transfer between host and device memory (144 GB/s on the Nvidia Tesla vs. 8 GB/s over PCIe x16)
• All parts of each single-user receiver and the interference cancellation run on the GPU
• Minimize the latency of intermediate data transfers between host and device memory
[Figure: processing blocks assigned to CPU and GPU]
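A back-of-the-envelope calculation shows why intermediate results are better kept on the device. The two bandwidth figures are the ones quoted above; the 1 MiB buffer size is a hypothetical value chosen only for illustration, and the estimate ignores transfer latency and protocol overhead.

```python
PCIE_X16_BYTES_PER_S = 8e9      # host <-> device, figure quoted on the slide
DEVICE_MEM_BYTES_PER_S = 144e9  # on-board GPU memory, figure quoted on the slide

def transfer_time_us(num_bytes, bytes_per_s):
    """Ideal transfer time in microseconds at peak bandwidth (no latency, no overhead)."""
    return num_bytes / bytes_per_s * 1e6

buffer_bytes = 1 << 20  # hypothetical 1 MiB intermediate buffer
pcie_us = transfer_time_us(buffer_bytes, PCIE_X16_BYTES_PER_S)
device_us = transfer_time_us(buffer_bytes, DEVICE_MEM_BYTES_PER_S)
ratio = pcie_us / device_us
print(f"PCIe: {pcie_us:.1f} us, device memory: {device_us:.1f} us, ratio: {ratio:.0f}x")
```

At peak rates, every round trip over PCIe costs 18 times as much as touching the same data in device memory, so a receiver chain that bounces between CPU and GPU after each block pays that penalty repeatedly.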
Simulation Setup
• GPU: Nvidia Tesla C2070 (1.15 GHz, CUDA compute capability 2.0)
• Comparison benchmark: Intel Xeon E5620 CPU (2.4 GHz)
• Channel coding: LDPC
  • Irregular Repeat Accumulate
  • Blocklength: 4800 bits
  • U1 code rate: 2/3, U2 code rate: 1/2
• Baud rate: 62500 symbols/second → real-time decoding threshold: ca. 85 ms (66 kbps)
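The quoted numbers can be cross-checked with a little arithmetic. The slide does not say how 66 kbps was obtained; one reading that reproduces it, assumed here, is the combined information bits of both users delivered within one 85 ms frame. The frame length in symbols is likewise derived from the threshold, not stated in the source.

```python
from fractions import Fraction

BAUD_RATE = 62_500              # symbols per second (from the slide)
THRESHOLD_S = 0.085             # real-time decoding threshold, "ca. 85 ms"
BLOCKLENGTH = 4800              # coded bits per LDPC codeword
RATE_U1, RATE_U2 = Fraction(2, 3), Fraction(1, 2)

# Information bits per codeword for each user.
k_u1 = int(BLOCKLENGTH * RATE_U1)
k_u2 = int(BLOCKLENGTH * RATE_U2)

# Assumption: the threshold equals the duration of one frame on air.
symbols_per_frame = BAUD_RATE * THRESHOLD_S

# Assumption: 66 kbps is the sum rate of both users over one frame.
sum_rate_kbps = (k_u1 + k_u2) / THRESHOLD_S / 1000

print(k_u1, k_u2, symbols_per_frame, round(sum_rate_kbps, 1))
```

Under these assumptions the receiver has to finish both users' decoding, including interference cancellation, before the next frame of roughly 5300 symbols has arrived.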
Simulation Result
Comparison of total processing time of MUD between CPU and GPU
Real-time threshold
Summary
• SDR implementation of MUD receiver
  • High flexibility and low cost
  • Extension to support more users
• GPU acceleration
  • 1.8x ~ 3.8x faster than the real-time decoding threshold
  • Still room for improvement
  • Newer GPUs → better performance
• GPU CUDA is very promising for powerful parallel computing
  • Low learning curve
  • Heterogeneous: mixed serial-parallel programming
  • Scalable
  • Days/weeks of simulation → hours
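To put the speedup range into milliseconds, the snippet below assumes that "N times faster than the threshold" means a total processing time of threshold / N, with the threshold taken as the approximate 85 ms from the simulation setup. Both the interpretation and the resulting times are derived here, not reported in the slides.

```python
THRESHOLD_MS = 85.0  # approximate real-time decoding threshold from the setup slide

# Assumption: "N x faster than the threshold" means processing time = threshold / N.
processing_ms = {factor: THRESHOLD_MS / factor for factor in (1.8, 3.8)}

for factor, ms in processing_ms.items():
    print(f"{factor}x faster -> about {ms:.1f} ms per frame")
```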
Thank you very much! Q&A