DEGREE PROJECT IN ELECTRICAL ENGINEERING, SECOND CYCLE, 30 CREDITS

STOCKHOLM, SWEDEN 2018

Timing delay characterization of GNU Radio based 802.15.4 network using LimeSDR

SAPTARSHI HAZRA

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCE


Abstract

Massive deployment of diverse ultra-low power wireless devices necessitates the rapid development of communication protocols. Software Defined Radio (SDR) provides a flexible platform for deploying and evaluating the real-world performance of these protocols. However, SDR based communication systems suffer from high and unpredictable delays. There is a lack of comprehensive understanding of the characteristics of the delays experienced by these systems on new SDR platforms like LimeSDR. This knowledge gap needs to be filled in order to reduce these delays and to better design protocols that can take advantage of these platforms.

We design a GNU Radio based IEEE 802.15.4 experimental setup in which the data path is time-stamped at various points of interest to get a comprehensive understanding of the characteristics of the delays. Our analysis shows that GNU Radio processing and LimeSDR buffering are the major delays in these data paths. We try to decrease the LimeSDR buffering delay by decreasing the USB transfer size, but this comes at the cost of increased processing overhead. The USB transfer packet size is modified to investigate which USB transfer size provides the best balance between buffering delay and processing overhead across two different host computers.

Our experiments show that, for the best-measured configuration, the mean and jitter of the latency decrease by 37% and 40% respectively on the host computer with higher processing resources. We also show that the throughput is not affected by these modifications. Higher processing resources help in handling the higher processing overhead and can therefore reduce the buffering delay further.

Keywords: Software Defined Radio; LimeSDR; GNU Radio; Latency; IEEE 802.15.4; USB Transfer Delay; USBMon


Sammanfattning (Summary)

Large deployments of heterogeneous, extremely low-power wireless devices demand rapid development of communication protocols. Software Defined Radio (SDR) provides a flexible platform for deploying and evaluating the actual performance of these protocols. However, SDR based systems suffer from large and unpredictable delays. A real understanding of how these delays behave on new platforms such as LimeSDR is lacking. These knowledge gaps need to be bridged in order to reduce the delays and to design protocols that take better advantage of the new platforms.

We create an experimental setup for IEEE 802.15.4 based on GNU Radio. Data passing through the system is time-stamped to provide a basis for understanding the characteristics of the delays. Our analysis shows that the delays come mainly from processing in GNU Radio and from buffering times in the LimeSDR. We try to reduce the LimeSDR buffering times by reducing the packet size for USB transfers, but this comes at the price of increased processing costs. The USB transfer packet size is modified in order to investigate, on two different test computers, the best balance between buffering times and processing costs.

Our experiments show that, for the most thoroughly examined experimental setup, the mean and the jitter of the delays decrease by 37% and 40% on the test computer with the most computing power. We also show that the throughput is not affected by these changes. With more computing power, the increased processing costs can be handled and the buffering times can be shortened more effectively.

Keywords: Software Defined Radio; LimeSDR; GNU Radio; Latency; IEEE 802.15.4; USB Transfer Delay


Acknowledgement

I would firstly like to thank my industrial supervisors Simon Duquennoy and Niklas Wirström for their constant help and guidance throughout the entire course of my thesis. Their constructive feedback helped me a lot during the writing phase of my thesis. I would also like to thank Peng Wang and Marina Petrova for the active discussions which helped me refine my work and the overall direction of my thesis work. I am thankful to the entire Networked Embedded Systems group at RISE SICS for making me feel welcome and for all those engaging lunchtime conversations. I am indebted to EIT Digital for accepting me into their Masters program and for their constant support throughout the entire duration of my studies. I am grateful to all my friends and family who supported me throughout my studies. I would like to express my gratitude particularly to my friends Shoumik, Pradyumna, Sai, Tony, Zeeshan and my opponent Jasper for giving me their time for the proofreading sessions and demo presentations. Finally, I would like to thank all those people who take the time to write online tutorials and articles to help novices like me understand a subject as complex as Software Defined Radios.


Contents

1 Introduction
  1.1 Problem Context
  1.2 Project Context
  1.3 Research Question
  1.4 Thesis Outline

2 Background
  2.1 Essential Concepts
    2.1.1 PHY and MAC Layers
    2.1.2 SDR Platforms
    2.1.3 GNU Radio
  2.2 LimeSDR-USB
    2.2.1 LimeSDR-USB Hardware Architecture
      2.2.1.1 LMS7002M
      2.2.1.2 FPGA
      2.2.1.3 Cypress EZ-USB FX3
    2.2.2 LimeSDR USB Software Architecture
  2.3 IEEE 802.15.4
  2.4 Wime Project
  2.5 Tools Used
    2.5.1 USBMon
    2.5.2 Pidstat

3 Literature Study

4 Timing Characterization of the LimeSDR-USB platform
  4.1 Methodology and Literature Reflection
  4.2 Experimental Setup and Implementation
    4.2.1 Experimental Setup
    4.2.2 Implementation
  4.3 Metrics
    4.3.1 Latency
    4.3.2 Component Delays
      4.3.2.1 Component Time Delays Measurement
      4.3.2.2 Measuring T4 and T5
      4.3.2.3 Data Correlation
  4.4 Experiment 1: Impact of network parameters on latency
  4.5 Experiment 2: Analysis of component delays
  4.6 Experiment 3: Impact of USB Transfer Size on the component delays

5 Mitigation Methods
  5.1 LimeSDR loopback delay
    5.1.1 Analytical understanding of the LimeSDR FPGA RX data path
      5.1.1.1 RX Path:
      5.1.1.2 RX Path Delays
    5.1.2 Decreasing ∆total
  5.2 Software Chain Delays
  5.3 Experiment 4: Evaluation of Mitigation Methods
  5.4 Experiment 5: Throughput Analysis

6 Results and Analysis
  6.1 Impact of data payload size and sampling rate on the overall latency
  6.2 Component Delay Analysis
  6.3 Impact of USB Transfer size
  6.4 Evaluation of mitigation strategies
    6.4.1 Impact on latency
    6.4.2 Impact on the component delays
    6.4.3 Influence of processing resources
  6.5 Effect on throughput

7 Conclusion
  7.1 Discussion
  7.2 Limitations
  7.3 Future Work

Bibliography

A Background
  A.1 CSMA and TDMA
  A.2 GNU Radio
    A.2.1 GNU Radio Block Types
    A.2.2 GNU Radio Interfaces
  A.3 LMS7002M
  A.4 USBMon
    A.4.0.1 Text Data Format
    A.4.0.2 Raw Binary


List of Figures

1.1 Software Radio and Traditional Radio Architecture
1.2 Problem Context Illustration (adapted from [23])
1.3 Research Question

2.1 Host-PHY [20] Software Defined Radio (SDR) architecture
2.2 Simple Radio Frequency receiver (adapted from [25])
2.3 Direct Digital Synthesis (adapted from [9])
2.4 Example of GNU Radio flow graph
2.5 Architecture of GNU Radio blocks (adapted from Johnathan Corgan slides [6])
2.6 Block Diagram of LimeSDR-USB
2.7 EZ-FX3 architecture
2.8 LimeSDR USB software architecture
2.9 Streamer Transmit Loop
2.10 LMS Control Packet Structure
2.11 MAC Data Frame Structure
2.12 O-QPSK PHY Packet
2.13 In-Phase and Quadrature Chip Sequences
2.14 GNU Radio Modulation flow graph
2.15 GNU Radio Demodulation flow graph
2.16 USBMon Architecture (adapted from [2])

4.1 Measurement Setup
4.2 Experimental Setup Flow Diagram
4.3 Dataflow Diagram
4.4 Radio Silent Problem
4.5 Periodic Message Source
4.6 Component Time Delay Measurement Setup
4.7 Overview of the timestamps
4.8 Timing Measurement Program flow chart
4.9 USB Data Analysis Program Structure
4.10 TimeStamp Correlation problem

5.1 LimeSDR FPGA RX Path
5.2 LimeSDR RX data path timing diagram

6.1 Experiment 1: Latency vs Sampling Rates, Message Sizes for laptop (lower processing resources)
6.2 Experiment 1: Latency vs Sampling Rates, Message Sizes for desktop (higher processing resources)
6.3 Experiment 2: Component Delays vs Message Sizes for laptop (lower processing resources)
6.4 Experiment 2: Component Delays vs Message Sizes for desktop (higher processing resources)
6.5 Experiment 2: Component Delays vs Sampling Rates for the desktop computer (higher processing resources)
6.6 Experiment 3: Mean Latency vs Batchsize of LimeSDR FPGA packets (1 FPGA packet = 4096 bytes)
6.7 Experiment 3: Component Delays vs Batchsize of LimeSDR FPGA packets
6.8 Experiment 4: Latency vs LimeSDR FPGA packet size for the laptop (lower processing resources)
6.9 Experiment 4: Latency vs LimeSDR FPGA packet size for the desktop computer (higher processing resources)
6.10 Experiment 4: Component Delays vs LimeSDR FPGA packet size for the desktop computer (higher processing resources)
6.11 Experiment 4: RX Software Chain Component Delays vs LimeSDR FPGA packet size for the desktop computer (higher processing resources)
6.12 Experiment 4: TX Software Chain Component Delays vs LimeSDR FPGA packet size for the desktop computer (higher processing resources)
6.13 Experiment 4: Comparison of the percentage of CPU usage across different LimeSDR FPGA packet sizes for both the host computers
6.14 Experiment 5: Comparison of throughput for the best latency case and the default configuration using the desktop computer

A.1 CSMA flow graph
A.2 Block Diagram of LMS7002M


List of Tables

2.1 LimeSDR-USB specifications
2.2 LimeSDR USB transfer endpoints

4.1 Hardware Specifications
4.2 Software Specifications

5.1 Min-Max analysis for different values of N

A.1 GNU Radio Block Types
A.2 Text USB Trace Example
A.3 URB Type and Direction


Acronyms

6LoWPAN   IPv6 over Low-Power Wireless Personal Area Network
ACK       Acknowledgement
API       Application Programming Interface
ARQ       Automatic Repeat reQuest
ASIC      Application Specific Integrated Circuit
ASK       Amplitude Shift Keying
BPSK      Binary Phase Shift Keying
CPU       Central Processing Unit
CRC       Cyclic Redundancy Check
CSMA      Carrier Sense Multiple Access
CSMA/CA   Carrier-sense multiple access with collision avoidance
CSMA/CD   Carrier-sense multiple access with collision detection
CSS       Chirp Spread Spectrum
DAC       Digital Analog Converter
DDR       Double Data Rate
DDS       Direct Digital Synthesis
DMA       Direct Memory Access
DSP       Digital Signal Processor
FCS       Frame Check Sequence
FFD       Full Function Device
FIFO      First In First Out
FIR       Finite Impulse Response
FPGA      Field Programmable Gate Array
FPRF      Field Programmable RF
FSM       Finite State Machine
GPIF      General Programmable Interface
GPMC      General Purpose Memory Controller
GTS       Guaranteed Time Slot
I2C       Inter-Integrated Circuit
IFS       Inter Frame Spacing
IO        Input Output
IoT       Internet of Things
IQ        In-Phase Quadrature
L2        Layer 2
LNA       Low noise amplifier
LPWAN     Low Power Wide Area Network
LQI       Link Quality Information
LR-WPAN   Low-Rate Wireless Personal Area Network
MAC       Medium Access Control
MIMO      Multiple Input Multiple Output
NCO       Numerically Controlled Oscillator
NIC       Network Interface Controller
O-QPSK    Offset Quadrature Phase Shift Keying
OSI       Open Systems Interconnection
PAN       Personal Area Network
PCIe      Peripheral Component Interconnect Express
PDU       Packet Data Unit
PGA       Programmable Gain Amplifier
PHY       Physical
PLL       Phase-Locked Loop
PN        Pseudo Noise
RAM       Random Access Memory
RAT       Radio Access Technology
RF        Radio Frequency
RFD       Reduced Function Device
SDR       Software Defined Radio
SFD       Start-of-Frame Delimiter
SHR       Synchronization Header
SPI       Serial Peripheral Interface
TDMA      Time Division Multiple Access
TSP       Transceiver Signal Processor
UHD       USRP Hardware Driver
USB       Universal Serial Bus
USRP      Universal Software Radio Peripheral


Chapter 1

Introduction

The Internet of Things (IoT) is enabling communication among huge numbers of diverse low power devices. According to an estimate by Ericsson [15], there will be 20 billion connected IoT devices by 2023. Modern communication protocols need to evolve rapidly to enable reliable connections among these devices. The communication needs of a field temperature sensor differ from those of an industrial controller. Hence, there is a need for research and development of communication protocols that satisfy these diverse communication needs. The evaluation of such experimental protocols is difficult because of the need for specialized radio hardware. Simulation is widely used to evaluate these protocols, but it falls short in modeling real-world performance. SDR devices can be powerful platforms for enabling the real-world evaluation of these protocols.

SDRs are flexible radio platforms in which most of the communication system's functionality is implemented in software. Typically, SDR platforms have an onboard radio front-end equipped with wideband antennas and an analog signal processing chain for tuning the carrier frequency and the desired bandwidth. High speed data converters convert the incoming analog signals into the digital domain and vice versa. In traditional radios, the digital processing chain of a wireless protocol's physical layer is implemented on the same chip as the radio front-end and the analog signal processing functions. An SDR in the host-PHY [20] architecture, on the other hand, transfers the converted data to a general purpose computing platform over a bus (USB, PCIe). The digital processing chain is designed in software, which allows for flexibility in the protocol design and enables experimentation with decoding and modulation techniques. SDR also allows for careful analysis of RF signals, as the raw sample data is made available to the host. The key difference between SDR and traditional radio is shown in Figure 1.1. In both types of radio, the data converters and the analog front-end are implemented on chip; traditional radios additionally include the digital signal processing for a specific protocol on the same chip.

Figure 1.1: Software Radio and Traditional Radio Architecture.

The movement of digital signal processing functions from hardware to software leads to performance issues in SDR systems. A fundamental challenge of an SDR system is computational horsepower, because it needs to process complex data waveforms in a reasonable time frame. Since SDR involves the transfer of signals and data from one system to another, considerable communication delays are introduced. Finally, general purpose processing systems introduce non-determinism in the data processing and communication processes.

1.1 Problem Context

Wireless devices share the wireless channel with other devices. The wireless protocol's Medium Access Control (MAC) layer is responsible for moderating access to the wireless channel. It typically uses Time Division Multiple Access (TDMA) or Carrier Sense Multiple Access (CSMA) to allocate the use of the channel. TDMA protocols schedule the allocation of the entire channel to one of the devices for a particular time duration. This requires global time synchronization among the devices so that each device knows when to transmit and when to receive. CSMA, on the other hand, uses the channel on an opportunistic basis, with the devices sensing whether the channel is free before transmitting.
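The difference between the two access schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the thesis implementation: the round-robin slot assignment, the slot length, the number of attempts and the `channel_busy` callback are all hypothetical.

```python
import random


def tdma_owner(now_us: int, slot_us: int, num_nodes: int) -> int:
    """Return the index of the node that owns the channel at time now_us.

    Assumes a simple round-robin schedule and a clock shared by all nodes.
    """
    return (now_us // slot_us) % num_nodes


def csma_try_send(channel_busy, max_attempts: int = 5, rng=random) -> int:
    """Sense the channel before sending and back off while it is busy.

    channel_busy is a hypothetical callback returning True if energy is
    detected on the channel. Returns the attempt index on which the node
    transmitted, or -1 if it gave up after max_attempts.
    """
    for attempt in range(max_attempts):
        if not channel_busy():
            return attempt  # channel sensed free: transmit now
        # A real node would now wait for a random number of backoff slots;
        # the value is drawn here only to show where the wait would happen.
        _backoff_slots = rng.randint(0, 2 ** (attempt + 1) - 1)
    return -1
```

TDMA needs no sensing but depends entirely on the shared clock, whereas CSMA needs no schedule but depends on how fresh the sensed channel state is.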


As highlighted by [23], SDR based systems do not comply with the stringent timing constraints imposed by modern MAC protocols. Furthermore, the long bus communication and processing delays create blind spots [23] in carrier sensing. Figure 1.2 shows the two problems originating from long delays. The left-hand image shows the problem of blind spots when using channel sensing for packet transmission. The SDR system wants to transmit a packet, so it uses channel sensing to detect when the channel is free. An over-the-air packet ends at time t0; this is detected by the Central Processing Unit (CPU) of the SDR system only after a certain delay, while the channel sensing process keeps checking the availability of the channel. The CPU uses the channel information from time t1 to infer that the channel is free and then starts the transmission. The TX packet reaches the air medium at time t2. In SDR systems this delay (t2 − t1) can be quite significant. This can result in collisions with other transmitting nodes: because of the communication and processing delays, the system is blind to the real-time channel situation. In regular radio chips, the channel sensing logic is located very close to the Radio Frequency (RF) front-end and the delay t2 − t1 is negligible, resulting in much smaller blind spots.
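The blind-spot argument is simple arithmetic, and a small sketch may make it concrete. The split of t2 − t1 into a receive path, a decision step and a transmit path, and all numeric values used with it, are assumptions made for illustration; they are not measurements from this work.

```python
def blind_spot_us(rx_path_us: float, decision_us: float, tx_path_us: float) -> float:
    """Length of the blind spot t2 - t1 in microseconds.

    rx_path_us:  age of the channel sample when the CPU sees it
    decision_us: time the host needs to decide that the channel is free
    tx_path_us:  time for the TX samples to travel from the host to the antenna
    """
    return rx_path_us + decision_us + tx_path_us


def collides(t1_us: float, blind_us: float, other_tx_start_us: float) -> bool:
    """True if another node starts transmitting inside the blind spot (t1, t2].

    Such a transmission begins after the channel state used for the decision
    was sampled, so the sensing node cannot know about it.
    """
    t2_us = t1_us + blind_us
    return t1_us < other_tx_start_us <= t2_us
```

For example, with hypothetical path delays of 400 µs (RX), 100 µs (decision) and 700 µs (TX), `blind_spot_us(400, 100, 700)` gives a 1200 µs window in which any transmission started by a neighbour goes unnoticed.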

Another problem is illustrated in the right-hand image of Figure 1.2. Because of the larger communication and processing delays, it takes significantly longer for the SDR system to acknowledge a received packet. This acknowledgement time is referred to as Inter Frame Spacing (IFS). The software implementation also introduces jitter in the IFS. For a low-power radio node in IoT applications, this jitter causes the radio to be turned on for a significantly longer time. The battery is drained faster, which undermines the goal of minimal maintenance in IoT systems. Another problem with a longer IFS is lower spectral efficiency, as the radio channel is left unused for long periods of time.

Figure 1.2: Problem Context Illustration (adapted from [23]).


The main problem with SDR systems is their long processing and buffering delays, which make them non-compliant with most wireless protocols. The characteristics of these delays need to be understood in order to design MAC protocols that take advantage of the flexibility provided by the SDR platform while taking its limitations into account.

1.2 Project Context

The project was conducted at RISE SICS as part of the 5G-Coral project [1]. 5G-Coral is a European Union H-2020 project which envisions a convergent radio access network. The project envisions numerous small multi-Radio Access Technology (RAT) gateways handling traffic from different devices running different protocols to enable convergent access. For this goal to be feasible, the cost effectiveness of the radio-head needs to be taken into account. LimeSDR [18] provides a cost-effective SDR platform that supports the desired frequency bands, making it the ideal choice as the project's radio-head.

Low power wireless devices are one of the main focus areas of the 5G-Coral project. IEEE 802.15.4 is the most popular network specification for Low Power Wide Area Network (LPWAN), i.e. IoT, systems. It specifically defines the physical layer and the MAC layer of the network stack. In order to successfully deploy a flexible radio-head for LPWAN devices, we need to understand the limitations and timing bottlenecks of this kind of system. The lack of previous studies on the LimeSDR platform presents a knowledge gap which should be filled for future deployments using that platform.

1.3 Research Question

Taking into consideration the problem and project context, we formulated the following research question: What are the characteristics of the delays introduced by the different components in LimeSDR based IEEE 802.15.4 networks?

The research question is illustrated in Figure 1.3, where the main components of the overall delay are highlighted as host side processing, bus communication delay, and SDR processing delay. Each of these components introduces a mean delay and jitter, represented in yellow and red respectively in Figure 1.3. The host side processing delay is introduced by running the software implementation of the 802.15.4 Physical (PHY) and MAC layers. The bus communication delay represents the delay caused by the Universal Serial Bus (USB) 3.0 bus transfers. The SDR processing delay is the time required by the LimeSDR platform for transmission or reception of radio signals. The objective of this thesis is to quantitatively evaluate these delays (∆T1, ∆T2, ∆T3, etc., as shown in Figure 1.3) and the impact of different network and SDR configuration parameters on them.

Figure 1.3: Research Question
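The decomposition into per-component mean delay and jitter can be sketched as follows. The sketch assumes that each packet carries one timestamp per measurement point and that jitter is taken as the standard deviation of a component delay; both the data layout and this definition of jitter are assumptions made for illustration, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def component_delays(timestamps):
    """Compute (mean, jitter) for each delay between consecutive timestamps.

    timestamps: list with one tuple per packet, (t0, t1, ..., tn), all in the
    same time unit and taken at the same measurement points for every packet.
    Returns a list with one (mean_delay, jitter) pair per component.
    """
    num_points = len(timestamps[0])
    results = []
    for i in range(num_points - 1):
        # Delay of component i for every packet.
        deltas = [packet[i + 1] - packet[i] for packet in timestamps]
        results.append((mean(deltas), pstdev(deltas)))
    return results
```

Because the components are measured back to back, the component means add up to the mean end-to-end latency; the jitter values do not add up in the same way unless the components are statistically independent.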

1.4 Thesis Outline

The thesis is structured as follows. Chapter 2 introduces the background information relevant for understanding the rest of the report. Relevant previous work is introduced in Chapter 3. Chapter 4 introduces the experimental setup and the methods used to measure the timing delays. The methods used to mitigate these delays are introduced in Chapter 5. Chapter 6 presents the results of the experiments and the subjective analysis of these results. Finally, Chapter 7 includes the concluding remarks and the scope of future work.


Chapter 2

Background

This chapter introduces the background information needed for understanding the remainder of the report. The first section provides a broad introduction to SDR systems, the GNU Radio software tool, and the PHY and MAC layers of the network stack. The second section introduces the LimeSDR platform and its hardware and software architecture. IEEE 802.15.4 is explained in the third section, followed by a description of the Wime project in the fourth section. Finally, the fifth section introduces the tools used in the methods chapter.

2.1 Essential Concepts

This section introduces general concepts for understanding communication system design using SDR platforms. In the first subsection, the PHY and MAC layers of the network stack are described, followed by the general architecture and functioning of SDR platforms. Finally, since this project uses GNU Radio as the software framework, we provide the background information necessary for understanding our methods.

2.1.1 PHY and MAC Layers

The Open Systems Interconnection (OSI) model [17] is the abstract networking model used in the design of most communication systems. The model is divided into seven layers, where the functionality of each layer is implemented separately and interacts directly with the layer beneath it. Data from the user application is encapsulated by each subsequent layer of the OSI model into its own frame format. These frames carry metadata in frame headers. Different protocols use different frame formats and headers, which helps differentiate one protocol from another. The headers tell the receiver where an incoming data packet comes from, whom it is meant for, how to decode and arrange its contents, and so on.
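Layered encapsulation can be pictured with a short sketch. The header contents below are placeholders invented for illustration; real protocols define structured headers with addresses, lengths and control fields.

```python
from typing import Sequence


def encapsulate(payload: bytes, headers: Sequence[bytes]) -> bytes:
    """Wrap the payload with one header per layer.

    headers are listed from the highest layer to the lowest, so the last
    header in the list ends up outermost, at the front of the frame.
    """
    frame = payload
    for header in headers:
        frame = header + frame
    return frame


def decapsulate(frame: bytes, header_lengths: Sequence[int]) -> bytes:
    """Strip the headers again, outermost first.

    header_lengths are given in the same order as the headers passed to
    encapsulate(), i.e. from the highest layer to the lowest.
    """
    for length in reversed(header_lengths):
        frame = frame[length:]
    return frame
```

With the placeholder headers `[b"N|", b"M|", b"P|"]` the payload `b"hi"` becomes `b"P|M|N|hi"`, and stripping three 2-byte headers at the receiver recovers the original payload.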

The PHY layer is the lowest layer of the OSI model; it interacts directly with the physical communication channel. It specifies the type of data transfer (serial/parallel) and the data rate of the protocol. The PHY layer defines the process of transmitting raw bits through the physical medium. The bitstream is grouped into code words and converted to symbols, which are then modulated onto a physical signal for transmission over the transmission medium. The PHY layer also provides physical transmission link information such as carrier sense, collision detection, and Link Quality Information (LQI) to the upper layers.
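The grouping of a bitstream into symbols can be sketched as below. The choice of 4 bits per symbol with the low-order bits sent first is an assumption made for illustration; it matches the convention of the 2.4 GHz O-QPSK PHY of IEEE 802.15.4, but other PHYs group bits differently. The function names are hypothetical, and the subsequent mapping of symbols to a waveform is not shown.

```python
def bytes_to_symbols(data: bytes) -> list:
    """Split every byte into two 4-bit symbols, low nibble first."""
    symbols = []
    for byte in data:
        symbols.append(byte & 0x0F)  # bits 0..3 form the first symbol
        symbols.append(byte >> 4)    # bits 4..7 form the second symbol
    return symbols


def symbols_to_bytes(symbols) -> bytes:
    """Inverse of bytes_to_symbols(): merge pairs of 4-bit symbols into bytes."""
    if len(symbols) % 2:
        raise ValueError("expected an even number of 4-bit symbols")
    return bytes(
        symbols[i] | (symbols[i + 1] << 4) for i in range(0, len(symbols), 2)
    )
```

A modulator would then replace each symbol index with the corresponding waveform or chip sequence before up-conversion.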

The MAC layer, the second layer of the OSI model, is responsible for defining how the common transmission medium is shared and used among multiple devices. For outgoing packets, the MAC layer adds the MAC address of the destination device to the packet header. It adds the synchronization preamble and the Frame Check Sequence (FCS) used for detecting transmission errors. Retransmissions of dropped packets and acknowledgements of successfully received packets are also handled by this layer.
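How an FCS detects transmission errors can be sketched with a 16-bit CRC. The sketch assumes the generator polynomial x^16 + x^12 + x^5 + 1 with a zero initial value and least-significant-bit-first processing, which is the CRC used for the IEEE 802.15.4 FCS; the function names are hypothetical and the code is not taken from the thesis implementation.

```python
def crc16_itu(data: bytes) -> int:
    """Bitwise CRC-16 over data (polynomial 0x1021, processed LSB first)."""
    crc = 0x0000
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0x8408  # 0x8408 is 0x1021 bit-reversed
            else:
                crc >>= 1
    return crc


def append_fcs(frame: bytes) -> bytes:
    """Sender side: append the 2-byte FCS, low byte first."""
    return frame + crc16_itu(frame).to_bytes(2, "little")


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Receiver side: the CRC over frame plus FCS is zero for an intact frame."""
    return crc16_itu(frame_with_fcs) == 0
```

A receiver that finds `fcs_ok()` false drops the frame without acknowledging it, which in turn triggers the retransmission handled by the MAC layer.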

2.1.2 SDR Platforms

SDR presents a new paradigm of communication system design in which the system can flexibly adapt to the needs of the end user as well as to the radio channel conditions. Nychis et al. [20] classify SDR based communication systems into two main architectures.

• Host-PHY Architecture: This is the most common architecture, enabling design and development of the entire system in software. It provides maximum flexibility in terms of design and implementation choices, with the added benefit of easy upgrades. However, since the system is implemented in software, the processing and communication delays make most modern MAC protocols infeasible in this architecture.

• NIC-PHY Architecture: In this architecture most of the PHY layer functionality is implemented in a Field Programmable Gate Array (FPGA) and a Digital Signal Processor (DSP). The close proximity to the radio hardware and the specialized parallel hardware processing make this architecture the most suitable for running modern MAC protocols. However, designing systems based on this architecture is time consuming and difficult, as hardware programming is traditionally harder than software programming. Even so, this architecture is much more flexible than a commercial Network Interface Controller (NIC). The Wireless Open Access Research Platform (WARP) [29] is an example of a system based on this type of architecture.

Since host-PHY [20] is the most commonly used architecture, as shown by use cases such as RFID [5], 802.11 [4] and cellular networks [8], the report concentrates on explaining the functionality of SDR systems using this architecture. Figure 2.1 shows the typical design of communication systems in this architecture. The system can be broadly divided into two main components:

1. SDR Platform.

2. Host Computer.

Although transmission and reception are logically the reverse of one another, the reception process is more complicated. The flexibility of SDR platforms can be attributed mainly to the flexibility provided by the receive chain of these platforms. Hence, this report concentrates on explaining the reception process of SDR platforms.

SDR Platform It is the hardware that provides access to the wireless medium in a flexible manner. RF signals are transmitted and received by the platform. It converts the received analog signals to digital samples and transfers them to the host computer. The main building blocks of these platforms are shown in Figure 2.1.

• Software Configurable RF transceiver This is the heart of an SDR platform and provides the RF modulation and demodulation capability. It is attached to wideband antennas for receiving and transmitting over a broad range of frequencies. Taking the case of reception, the signal received from the antenna is amplified by a Low Noise Amplifier (LNA). The LNA amplifies a low power signal without significantly degrading the signal to noise ratio. Once amplified, the signal is passed to an RF receiver, where it is demodulated either to an Intermediate Frequency (IF) or to a baseband signal, depending on whether the receiver is a superheterodyne receiver or a zero-IF receiver, respectively.


Figure 2.1: Host-PHY [20] SDR architecture.

Figure 2.2 shows a simple RF receiver in which the LNA output signal is fed to a mixer. The mixer is a signal processing block used for translating the input signal to another frequency range. The mixer uses a locally generated carrier frequency for the translation. If the input RF signal has a frequency of f_rf and the local oscillator frequency is f_lo, then the mixer produces a signal with frequency components f_rf − f_lo and f_rf + f_lo. This output signal is then passed through a bandpass filter with f_rf − f_lo as its center frequency, which rejects the unwanted f_rf + f_lo component. If we assume f_rf = f_lo, the bandpass filter is equivalent to a low pass filter and the output signal is the baseband signal. This is the case in the Zero-IF receiver architecture. In the superheterodyne receiver architecture, a number of intermediate frequency stages are used before generation of the baseband signal.

Figure 2.2: Simple Radio Frequency receiver (adapted from [25])

In a traditional transceiver, a crystal oscillator is used. This results in a stable local oscillator signal, but the system is then tuned to one particular frequency. With the goal of flexibility in mind, SDR platforms use frequency synthesizers to generate the local oscillator signal. Frequency synthesizers are used for creating arbitrary waveforms from a single frequency clock. Most SDR platforms use Direct Digital Synthesis (DDS) as the frequency synthesizer, with a highly stable oscillator as the reference signal.

Figure 2.3: Direct Digital Synthesis (adapted from [9])

The main components of a DDS are a Numerically Controlled Oscillator (NCO), a Digital-to-Analog Converter (DAC) and a tuning word register, as shown in Figure 2.3. The NCO is composed of a phase accumulator and a phase to amplitude converter. In each clock cycle, the phase accumulator increases its output (the phase) by the value stored in the tuning word register. This output is the input to the phase to amplitude converter, which is a lookup table containing the amplitude for each phase; the phase accumulator output serves as the address into this table. The lookup table output is then converted to an analog signal by the DAC and passed through a low pass filter to smooth the waveform.

Since the tuning word register determines how fast the phase changes, the output sinusoid frequency can be controlled through the tuning word register. This is how SDR platforms are able to generate a wide range of local oscillator frequencies.

In SDR platforms, the filter (shown in Figure 2.2) is implemented using Finite Impulse Response (FIR) filters. FIR filters are convolutional filters whose response to an impulse has finite duration. The response of an FIR filter can be adjusted by changing the coefficients of the filter's impulse response. This allows the SDR to pass a certain range of frequencies to the output, and this range can be adjusted by external control.

• RF Transceiver Controller The controller provides the interface to control the RF transceiver. The communication system running on the host computer provides the desired RF parameters to the controller. The controller then translates those instructions into digital signals for configuring the RF transceiver modules, such as the FIR filter weights, the tuning word register for selecting the desired frequency, and the desired gain of the programmable gain amplifiers.

• ADC/DAC The filtered analog RF signal needs to be converted to the digital domain before being transferred to the host computer. Fast data converters are used on SDR platforms for this purpose. The sampling rate of these data converters determines the available RF bandwidth of the system according to the Nyquist criterion. The resolution of the data converters is important for ensuring a high dynamic range of the received signal, so that the system is capable of receiving very weak signals as well as very strong signals without saturation. The sampling rate of these data converters is controlled by the RF transceiver controller.

• FPGA The FPGA acts as the glue logic between the data converters and the bus controller. In most cases, the bus communication is bursty in nature, whereas the ADC produces a continuous stream of samples. The FPGA provides efficient buffering of these samples and packs them into bursts to be sent over the bus. In some platforms, additional information such as a sample clock is also packed into these bursts. Some applications need additional signal processing, and the FPGA provides an efficient way to implement the required filters. In the NIC-PHY architecture most of the communication system is designed using the FPGA.

• Bus Controller It is the bridge for the bursts of data crossing over from the SDR platform to the host computer, and vice versa. It takes in data packets from the FPGA, encodes them according to the bus transfer protocol, and then initiates the transfer. Flow control and routing of the different packets are also handled by the bus controller.
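
The DDS tuning mechanism described for the RF transceiver above can be illustrated with a short simulation. The following is a minimal sketch and not the LMS7002M implementation: the 32-bit accumulator width, the 256-entry lookup table and the function names `tuning_word_for` and `dds_samples` are assumptions chosen for illustration. It relies on the standard DDS relation f_out = M · f_clk / 2^N, where M is the tuning word and N is the accumulator width.

```python
import math

ACC_BITS = 32   # assumed phase accumulator width (illustrative only)
LUT_BITS = 8    # assumed lookup table address width (illustrative only)
# Phase to amplitude converter: one sine period stored in a lookup table.
LUT = [math.sin(2 * math.pi * i / (1 << LUT_BITS)) for i in range(1 << LUT_BITS)]


def tuning_word_for(f_out_hz, f_clk_hz):
    """Tuning word M such that f_out is approximately M * f_clk / 2**ACC_BITS."""
    return round(f_out_hz * (1 << ACC_BITS) / f_clk_hz)


def dds_samples(tuning_word, n_samples):
    """Run the phase accumulator and return the looked-up amplitudes."""
    phase = 0
    out = []
    for _ in range(n_samples):
        # The top LUT_BITS bits of the accumulator address the lookup table.
        out.append(LUT[phase >> (ACC_BITS - LUT_BITS)])
        # Each clock cycle the accumulator advances by the tuning word and wraps.
        phase = (phase + tuning_word) & ((1 << ACC_BITS) - 1)
    return out


if __name__ == "__main__":
    f_clk = 100e6
    m = tuning_word_for(1e6, f_clk)
    print("tuning word:", m, "actual output frequency:", m * f_clk / (1 << ACC_BITS))
```

A larger tuning word makes the accumulator wrap more often per second, which is why changing a single register value retunes the local oscillator.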

Host Computer The host computer, a general purpose computer, is the brain of the communication system. It runs the software implementation of the baseband processing for the desired protocol, taking the digital samples from the SDR as input. During initialization, it configures the SDR platform. Depending on the implementation, it can host the full network stack with an application running on top of it. From an architectural viewpoint, the host computer has three main components, as shown in Figure 2.1.

• Bus Controller The bus controller on the host computer controls the other end of the bus communication link. It decodes the received data bursts and sends them to the driver for further processing. If the bus communication involves a master/slave relationship, the host computer bus controller is designated as the master. It initiates the data transfers on the bus, and the slave bus controller responds to the requests placed by the master.

• Driver The driver is the abstraction layer for SDR platform communication and configuration. The communication system designer should be able to give high-level instructions to the SDR platform. It is the responsibility of the driver to translate these high-level instructions into low-level register control data words. For example, when tuning the RF transceiver, the system designer would much rather say "set the center frequency to 1.8 GHz" than "set the register at address x to value data". The driver handles this translation. It is also responsible for ensuring a reliable data transfer link. Since the communication system and the incoming data may be running at different rates, the driver buffers the incoming data and provides it to the running communication system at its data processing rate.

• Digital Signal Processing Framework The hardware baseband processing of communication systems is generally designed with concurrent execution in mind, whereas general purpose computing platforms are sequential in nature. Many-core processors add the capability of concurrent execution, but at a much smaller scale than what can be achieved with hardware processing platforms such as an FPGA or an Application Specific Integrated Circuit (ASIC). When designing systems on general purpose computing platforms, this change in execution model needs to be taken into account.

Threads provide a software method to implement a concurrent execution model. However, thread management and synchronization introduce significant overhead. Digital signal processing frameworks are designed so that system designers can concentrate on their signal processing modules without worrying about thread synchronization and management. They are designed with concurrent execution of signal processing algorithms and modularity of system design in mind. Two of the most popular frameworks are GNU Radio and LabVIEW. In the next subsection, the report goes into the details of GNU Radio.

2.1.3 GNU Radio

GNU Radio is an open-source digital signal processing framework, which has been evolving rapidly with a large active community. The software framework can be used both for simulation and for prototyping real-world application scenarios. It provides a graphical interface for designing signal processing chains, as well as an extensive library of signal processing blocks such as filters, synchronizers and demodulators. The ease of use of the framework, its extensive library, and its hardware support for most SDR platforms have led to diverse application use cases such as RFID [5], 802.11 [4] and cellular networks [8].

GNU Radio is designed to stream large amounts of data in real time between parallel computational nodes. The data flow between the nodes from the source to the sink is described by the flow graph, while the flow of data is controlled by the GNU Radio scheduler. Figure 2.4 shows a simple flow graph where the data from the source nodes is processed by the signal processing chain. The processed data is sent to the sink nodes. The signal processing chain itself is composed of multiple data processing nodes; for example, the flow graph in Figure 2.4 has four nodes in the processing chain. The scheduler schedules the execution of these blocks. From the block designer's point of view, these blocks can be viewed as executing concurrently.


Figure 2.4: Example of GNU Radio flow graph.

Blocks and Flow Graphs The computational nodes in the flow graph are the GNU Radio processing blocks. Each block describes, in its work function, how the input elements to the block are converted to output elements.

The flow graph describes the flow of data between different blocks. Flow graphs make it easier to design complex signal processing algorithms by combining simpler blocks. This provides modularity and scalability to the algorithm development process. Generally, blocks are designed in C++ to enable fine grained control and faster execution. The flow graphs are described as instantiations of these C++ blocks, with inter-block signaling defined by Python's Qt framework [24].

GNU Radio Block Architecture Figure 2.5 shows the general architecture of a GNU Radio block. Each block has buffers associated with its interfaces and two computational components, namely the work function and the message handler function. A runtime scheduler is associated with each block for controlling the execution of the block during runtime. The runtime scheduler has its own signaling mechanism for interacting with the schedulers of other blocks. These signaling mechanisms are hidden from the flow graph designer and are used for flow control.


Figure 2.5: Architecture of GNU Radio blocks (adapted from Johnathan Corgan's slides [6])

The blocks providing the inputs to the current block are called upstream blocks, while those that are fed by the output of this block are called downstream blocks.

For the sake of simplicity, it is assumed that the current block in Figure 2.5 has one upstream block and one downstream block. This implies that the block has a single input buffer and a single output buffer. The work function describes the implementation of the signal processing algorithm. On starting execution, the work function accesses data from the circular input buffer, which is the same as the output buffer of the upstream block. Both the upstream block and the current block maintain a pointer to the last used data element position. On completion of its execution, the upstream block writes new elements to the input buffer of the current block and updates its write pointer. The upstream block notifies the current block, using the notify_downstream() method, that new input data elements have been written. The scheduler for the current block checks whether there is sufficient new data for a single execution and starts the execution of the current block once the input buffer holds sufficient data. On successful execution, it updates the read pointer, writes the data elements into the output buffer, and updates the write pointer.

Since most filter designs rely on the history of previous inputs, the scheduler also notifies the upstream block when it reads data from the buffer, to indicate that its output buffer may have been modified. The current block scheduler notifies the downstream block that new data elements are available when it writes to the output buffer. The message handler function works similarly. In this case, the upstream block scheduler uses the notify_msg() method for signaling that a new message may be available.
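
The buffer handling described above can be sketched in a few lines. This is an illustrative model only and not GNU Radio's actual implementation: the class `CircularBuffer`, the function `run_block_once` and the assumption that the work function returns as many items as it consumes are all hypothetical simplifications.

```python
class CircularBuffer:
    """Fixed-size ring buffer shared by one upstream and one downstream block."""

    def __init__(self, size):
        self.data = [None] * size
        self.size = size
        self.write_idx = 0  # maintained by the upstream (producer) block
        self.read_idx = 0   # maintained by the downstream (consumer) block
        self.count = 0      # number of items written but not yet read

    def space_available(self):
        return self.size - self.count

    def items_available(self):
        return self.count

    def write(self, items):
        if len(items) > self.space_available():
            raise BufferError("not enough free space in buffer")
        for item in items:
            self.data[self.write_idx] = item
            self.write_idx = (self.write_idx + 1) % self.size
        self.count += len(items)

    def read(self, n):
        if n > self.count:
            raise BufferError("not enough items in buffer")
        items = []
        for _ in range(n):
            items.append(self.data[self.read_idx])
            self.read_idx = (self.read_idx + 1) % self.size
        self.count -= n
        return items


def run_block_once(work, in_buf, out_buf, n_items):
    """Hypothetical scheduler step: execute `work` only when its requirements hold."""
    if in_buf.items_available() < n_items:
        return False  # wait for the upstream block to write more data
    if out_buf.space_available() < n_items:
        return False  # wait for the downstream block to read data
    out_buf.write(work(in_buf.read(n_items)))
    return True
```

In this model a return value of `False` corresponds to the block waiting for a notification from its neighbour, while `True` corresponds to one completed execution of the work function.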

Scheduler The scheduler is the control unit for the flow graph. At initialization, the scheduler allocates the buffers and instantiates each block in its own thread. At runtime, it performs memory management for each block and checks the requirements set by the block, such as the number of items to be processed in one execution and the alignment of data in the buffers. Once the requirements are satisfied, it passes the read and write pointers to the work function and starts one execution. Once the work function finishes its execution, the scheduler takes the returned information and updates the state of the block and the appropriate pointers.

2.2 LimeSDR-USB

The SDR platform used in this project is the LimeSDR-USB. It follows the architecture of the SDR platform shown in Figure 2.1. The technical specifications of the LimeSDR-USB, and the component descriptions with reference to Figure 2.1, are summarized in Table 2.1. In the next subsections, the report discusses the hardware and software architecture of the LimeSDR-USB.

| Feature | Description |
|---|---|
| Software Configurable RF Transceiver | LMS7002M MIMO FPRF |
| FPGA | Altera Cyclone IV EP4CE40F23 |
| Bus Controller | Cypress USB 3.0 CYUSB3014-BZXC |

Table 2.1: LimeSDR-USB specifications

2.2.1 LimeSDR-USB Hardware Architecture

Figure 2.6 shows the block diagram of a LimeSDR-USB board. For the sake of simplicity, the diagram shows only the major components, namely the LMS7002M Field Programmable RF (FPRF) transceiver [19], the FPGA and the Cypress EZ-USB FX3 bus controller.

2.2.1.1 LMS7002M

LMS7002M is a fully integrated FPRF transceiver providing 2×2 Multiple Input Multiple Output (MIMO) functionality. It provides continuous coverage of the 100 kHz to 3.8 GHz frequency range, with on-chip data converters providing 160 MHz of RF bandwidth. It is designed for a broad range of applications, including broadband wireless communication, cellular communication and SDR applications. The hardware consists of three main sub-components: the analog processing chain, the data converters, and the digital processing chain.

Figure 2.6: Block Diagram of LimeSDR-USB.

The LMS7002M analog processing chain is responsible for the RF transceiver's modulation and demodulation functionality. It uses a Zero-IF transceiver architecture, with programmable gain controllers and filters for tuning the performance of the transceiver. Fast data converters convert signals from the analog to the digital domain and vice versa. The data converters provide a resolution of 12 bits for a high dynamic range of the incoming and outgoing signals. The digital processing chain provides configurable filters for interpolation and decimation of the digital samples, along with automatic gain control.

LMS7002M interfaces can be segmented into control interfaces and data interfaces. The control interfaces are used for initialization, calibration and on-the-fly reconfiguration of the LMS7002M parameters. The data interface is used for exchanging In-Phase Quadrature (IQ) samples with the baseband modem. For the data interface, the LMS7002M implements a 12-bit Double Data Rate (DDR) interface for each RX/TX chain. The LMS7002M has an on-chip micro-controller which can be used for configuration and control of the chip. It also provides a Serial Peripheral Interface (SPI) for offloading the control and configuration functionality to the baseband modem.

2.2.1.2 FPGA

The LMS7002M is designed to stream data continuously, whereas the Cypress FX3, which uses the USB 3.0 protocol, transmits data in packets. The LimeSDR-USB uses an Altera Cyclone IV FPGA to buffer the streaming data, convert it to packets and add metadata to each packet, as recommended by Nychis et al. [20]. The architecture of the FPGA data path blocks is shown in Figure 2.6. The TX path is responsible for moving data from the USB interface to the LMS7002M. The RX path, on the other hand, controls the reception of samples from the LMS7002M and their subsequent transfer to the host computer via the USB interface. The FPGA also has onboard PLLs, which are configured to match the sample clock.

The FPGA interfaces are designed to handle the segregation of control and data paths by the LMS7002M. The data paths (TX path and RX path) use a 12-bit parallel interface to stream data to and from the LMS7002M data interface. The control path of the FPGA has a NIOS processor which interacts directly with the USB slave interface and controls the RF parameters through an SPI interface.

2.2.1.3 Cypress EZ-USB FX3

The LimeSDR-USB uses a Cypress EZ-USB FX3 [11] as its USB 3.0 peripheral controller. It has a fully configurable interface called the General Programmable Interface (GPIF) II, which allows integration with external devices such as ASICs, FPGAs and image sensors. It also provides low-speed interfaces such as Inter-Integrated Circuit (I2C) and SPI for low-speed Input Output (IO) operations.

The hardware architecture of the EZ-USB FX3 is shown in Figure 2.7. The "ARM926EJ" is a 32-bit microprocessor operating at 200 MHz. It is responsible for configuring and controlling the distributed Direct Memory Access (DMA) controller and the peripherals. The "USB" block is a USB 3.0 peripheral controller. It implements the USB 3.0 physical layer and is responsible for handling the communication with the host computer. The GPIF II interface handles the data streams to and from the FPGA of the LimeSDR-USB. It implements a synchronous slave FIFO interface which is controlled by the slave FIFO block of the LimeSDR FPGA.


Figure 2.7: EZ-FX3 architecture

For the purpose of this project, the implementation details of the USB and GPIF II interfaces need to be explored further. The next two paragraphs concentrate on presenting a concrete understanding of these two important interfaces in the context of the LimeSDR-USB.

LimeSDR USB endpoints USB endpoints are buffers on the USB device. They are a logical abstraction that lets multiple parallel data streams use a single physical channel. The LimeSDR-USB uses four different endpoints for USB data transfer. These endpoints are divided into control and data endpoints for both the input and output directions. The control endpoints are used for configuring and retrieving data from the LMS7002M, whereas the data endpoints are used for streaming data to and from the LMS7002M data converters through the FPGA. The endpoint addresses and their associated functions in the LimeSDR-USB are shown in Table 2.2.

USB endpoints also specify the size of the data packet. For USB 3.0 the maximum data packet size is 1024 bytes. If the data sent to the USB controller is larger than 1024 bytes, the controller segments the data into multiple data packets and sends them as multiple bursts of data in a single USB transaction.
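
The segmentation rule can be made concrete with a small example. The sketch below is not taken from the FX3 firmware; the function name `segment_transfer` is hypothetical and only the 1024-byte limit comes from the text. It shows how a 4096-byte buffer, the DMA buffer size quoted for the LimeSDR-USB later in this section, would be split into four maximum-size data packets.

```python
USB3_MAX_PACKET_SIZE = 1024  # bytes, the USB 3.0 maximum data packet size


def segment_transfer(payload, max_packet=USB3_MAX_PACKET_SIZE):
    """Split a payload into data packets of at most max_packet bytes each."""
    return [payload[i:i + max_packet] for i in range(0, len(payload), max_packet)]


if __name__ == "__main__":
    packets = segment_transfer(bytes(4096))
    print(len(packets), [len(p) for p in packets])
```

A payload that is not a multiple of 1024 bytes ends with one shorter packet.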


| Endpoint address | Function |
|---|---|
| 0x01 | Stream Data Output |
| 0x81 | Stream Data Input |
| 0x0F | Control Data Output |
| 0x8F | Control Data Input |

Table 2.2: LimeSDR USB transfer endpoints

GPIF II The GPIF II is a programmable state machine that provides the flexibility of implementing a custom interface. The GPIF II segments the functionality into control and data interfaces. For the data interface, the LimeSDR-USB defines a 32-bit interface running at 100 MHz. The state transitions of the state machine are based on the input control signals of the GPIF II interface. The output control signals are driven by the state transitions of the internal state machine.

The LimeSDR-USB implements a slave FIFO interface using the GPIF II interface. The slave FIFO interface allows the FPGA to perform read/write operations on the internal FIFO buffers directly. The control interface is used for addressing the internal FIFOs and for signaling read and write operations. In addition, the control interface has flag signals to indicate events to the FIFO slave interface. The 32-bit data interface transfers the data according to the signaling and addressing used on the control interface.

The distributed DMA controller uses sockets and threads to allow data transfer between the Random Access Memory (RAM) and the peripherals of the Cypress EZ-USB FX3. They are explained in the next paragraph to help in understanding the slave FIFO interface.

Sockets and Threads Sockets are connections between a peripheral, such as GPIF II or I2C, and the Cypress EZ-USB FX3 RAM. The microprocessor initializes DMA buffers, which are used for intermediate storage of data. The microprocessor keeps the address and size of these DMA buffers in a linked list structure. The elements of this structure are referred to as DMA descriptors. The sockets are implemented as a structure with a pointer to a DMA descriptor and interrupt flags.

The sockets can signal other sockets automatically without CPU intervention, or signal the CPU through interrupts. Automatic signaling is used when the microprocessor does not need to modify any data in the data stream. The socket which writes data to the buffer is called the producer socket, and the socket which reads data from the buffer is called the consumer socket, according to the terminology defined by [12].

One producer socket and one consumer socket accessing the same DMA buffer can be encapsulated in a configuration referred to as a DMA channel. A DMA channel can have multiple DMA buffers. Each DMA buffer has its own empty/full flag to signal the producer and consumer sockets. For example, if a 1024-byte buffer is allocated to the DMA channel, the full flag is set once the producer socket has written 1024 bytes. In the case of the LimeSDR-USB, two DMA channels are defined: the TX DMA channel and the RX DMA channel. They have a GPIF II socket and a USB socket as producer and consumer sockets. Each channel has multiple DMA buffers, and each buffer is allocated 4096 bytes.

The sockets are internal to the Cypress EZ-USB FX3, so GPIF threads are defined for external devices to access the DMA buffers. The GPIF threads connect sockets with external pins. For example, the GPIF threads connect the GPIF II interface with the GPIF II sockets to implement the slave FIFO interface.

2.2.2 LimeSDR USB Software Architecture

In this subsection, the report explains the driver of the LimeSDR-USB. The software architecture of the LimeSDR-USB is shown in Figure 2.8.

The streamer block is the main software component responsible for interacting with the LimeSDR-USB. LimeSDR defines the dataport as an abstract interface, whose methods can be mapped to one specific interface implementation such as Peripheral Component Interconnect Express (PCIe) or USB, depending on the configuration defined by the streamer block. In the case of the LimeSDR-USB, the dataport is configured to use the USB bus. For interacting with application software such as GNU Radio, stream channels are defined. They are FIFOs with control logic for receiving data from or sending data to the application software. To understand the functionality of the streamer block, the report takes a look at the TX data path. The RX path is similar in functionality to the TX path, with data flowing in the opposite direction.

TX Data Path The application software configures the TX stream channels through the Lime Application Programming Interface (API). The TX FIFOs are then initialized, and the control parameters, such as the batchsize (shown in Figure 2.9) of the FPGA packets and the format of the data to be stored in the FIFO, are sent to the streamer block. The streamer block parses this information and configures the hardware data path with the provided configuration. The streamer transmit loop block initializes internal buffers which are used for sending data through the dataport.

Figure 2.8: LimeSDR USB software architecture

The application software pushes data to the TX stream channel, shown as IQ data in Figure 2.9. The streamer transmit loop runs continuously and, once there is data in the stream channels, reads the data using the stream channel read function. It converts the data, for example from 32-bit float data to 16-bit integer data, and copies the data to its internal buffers. It then packs the data into the FPGA data packet format, shown as the middle component of Figure 2.9. Each FPGA data packet contains flags and a packet counter field followed by 4080 bytes of data. These FPGA data packets are batched together as specified by the batchsize. The batch is handed to the USB host controller for transmission to the LimeSDR-USB using the Linux USB library, libusb.
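
The conversion, packing and batching steps of the transmit loop can be sketched as follows. This is a simplified illustration and not the LimeSuite source: the 16-byte header layout (8 flag bytes followed by a 64-bit counter), the scaling to a 12-bit range, the counter being incremented per sample and all function names are assumptions. Only the 4080-byte payload and the batching by batchsize are taken from the text.

```python
import struct

PAYLOAD_BYTES = 4080                     # payload per FPGA data packet (from the text)
HEADER_FORMAT = "<8sQ"                   # assumed: 8 flag/reserved bytes + 64-bit counter
SAMPLES_PER_PACKET = PAYLOAD_BYTES // 4  # one I/Q pair occupies two 16-bit integers


def float_to_int16(x, full_scale=2047):
    """Clip a float to [-1.0, 1.0] and scale it to an integer (12-bit range assumed)."""
    x = max(-1.0, min(1.0, x))
    return int(round(x * full_scale))


def pack_packets(iq_samples, start_counter=0):
    """Pack (i, q) float pairs into fixed-size packets; the last one is zero padded."""
    packets = []
    counter = start_counter
    for start in range(0, len(iq_samples), SAMPLES_PER_PACKET):
        chunk = iq_samples[start:start + SAMPLES_PER_PACKET]
        payload = b"".join(
            struct.pack("<hh", float_to_int16(i), float_to_int16(q)) for i, q in chunk
        )
        payload = payload.ljust(PAYLOAD_BYTES, b"\x00")
        packets.append(struct.pack(HEADER_FORMAT, bytes(8), counter) + payload)
        counter += len(chunk)  # assumed: the counter tracks the sample index
    return packets


def make_batches(packets, batch_size):
    """Concatenate packets into batches; each batch models one USB transfer."""
    return [b"".join(packets[i:i + batch_size]) for i in range(0, len(packets), batch_size)]
```

With these assumptions each packet is 4096 bytes long, so a batch of `batch_size` packets corresponds to a USB transfer of `batch_size * 4096` bytes. A smaller batch size shortens the time samples wait in the buffer but increases the number of transfers that have to be processed.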

TX Control Path The LimeSDR-USB is configured at initialization using the lms7 API. The lms7 API provides an abstraction layer for configuring the LimeSDR-USB. It provides the means to control the LMS7002M through the NIOS SPI interface. The lms7 API packs the control data into LMS64C control packets, shown in Figure 2.10. One LMS64C protocol packet is at most 64 bytes long. If the data to be sent is larger than 64 bytes, the data field is segmented into several packets. The LMS64C protocol keeps a packet buffer and adds each prepared packet to this buffer. These packets are then written to the control endpoint using the dataport.

Figure 2.9: Streamer Transmit Loop

Figure 2.10: LMS Control Packet Structure


2.3 IEEE 802.15.4

IEEE 802.15.4 [14] is a Low-Rate Wireless Personal Area Network (LR-WPAN) specification aiming at ultra-low complexity, ultra-low power and low data rate. It specifies the PHY and MAC layers for LR-WPANs. It forms the basis for network protocols such as IPv6 over Low-Power Wireless Personal Area Network (6LoWPAN) [16], Zigbee [31] and WirelessHART [13]. It is intended to provide a data rate of 250 kbps at a communication range of 10 meters.

The IEEE 802.15.4 network topologies define the ways in which devices can talk to other nodes in the network. The two main network topologies used in IEEE 802.15.4 are star and mesh. In the star topology, all devices communicate with one another through a central device called the Personal Area Network (PAN) coordinator. In the mesh topology, the devices can talk to one another without the need for a central coordinator.

These network topologies help in defining the role and type of each device. A Reduced Function Device (RFD) can communicate but has no routing functionality, so RFDs are usually the end devices in a network. A Full Function Device (FFD) can route information in addition to regular communication. A coordinator is a special FFD which sets up the network and acts as the manager of the network.

The IEEE 802.15.4 PHY layer defines different radio channels and modulation techniques for these network nodes. It specifies Offset Quadrature Phase Shift Keying (O-QPSK), Binary Phase Shift Keying (BPSK), Amplitude Shift Keying (ASK), Chirp Spread Spectrum (CSS) etc. as physical layer modulation techniques. The main frequency band of interest is the ISM band at 2.4 GHz, where 16 channels are available globally. The channels are spaced 5 MHz apart and each has a channel bandwidth of 2 MHz.
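
The channel plan in the 2.4 GHz band can be written as a one-line formula. The sketch below uses the channel numbering 11 to 26 and the relation F_c = 2405 + 5 (k − 11) MHz, which come from the IEEE 802.15.4 standard rather than from this text; the function name `channel_center_mhz` is hypothetical.

```python
def channel_center_mhz(channel):
    """Center frequency in MHz of a 2.4 GHz IEEE 802.15.4 channel (11 to 26)."""
    if not 11 <= channel <= 26:
        raise ValueError("2.4 GHz channels are numbered 11 to 26")
    return 2405 + 5 * (channel - 11)


if __name__ == "__main__":
    for k in range(11, 27):
        print(k, channel_center_mhz(k), "MHz")
```

This is the value a driver would pass as the center frequency when tuning the SDR to a given channel.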

The MAC layer offers handshakes for reliability. CSMA with collision avoidance, TDMA with synchronization beacons and Guaranteed Time Slots (GTS) are defined as methods of medium access. There are four different MAC frames defined for different functions: data, acknowledgement, beacon and MAC command. Data and acknowledgement frames are used for data communication, whereas beacon and MAC command frames are used for network maintenance.

The relevant MAC data frame structure is shown in Figure 2.11. The function of each field is described below:


Figure 2.11: MAC Data Frame Structure

• MHR defines the MAC data frame header. It consists of the following subfields:

– Frame Control specifies the type of the frame and how the rest of the frame is structured, such as the source and destination addressing modes.

– Sequence Number specifies the sequence identifier for the frame.

– Destination PAN ID is an unsigned integer which is the unique PAN ID of the destination node.

– Destination Address is the MAC address of the intended recipient.

– Source PAN ID is the PAN ID of the sender.

– Source Address is the MAC address of the originator of the frame.

• MAC Payload is the link layer frame which includes the data payload.

• FCS is a 16-bit ITU-T Cyclic Redundancy Check (CRC) calculated over the MHR and MAC Payload fields.
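
The FCS computation can be sketched as a bitwise CRC. The parameters used below (generator polynomial x^16 + x^12 + x^5 + 1, zero initial value, bits processed least significant bit first, FCS appended low byte first) are the ones commonly used for IEEE 802.15.4 and are stated here as assumptions, since the text only names the 16-bit ITU-T CRC. The function names `fcs16` and `append_fcs` are hypothetical.

```python
def fcs16(data):
    """16-bit ITU-T CRC (x^16 + x^12 + x^5 + 1), zero initial value, LSB first."""
    crc = 0x0000
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0x8408  # 0x8408 is the bit-reversed form of 0x1021
            else:
                crc >>= 1
    return crc


def append_fcs(mhr_and_payload):
    """Return the MHR and MAC payload followed by the two FCS bytes, low byte first."""
    return mhr_and_payload + fcs16(mhr_and_payload).to_bytes(2, "little")


def fcs_ok(frame):
    """A frame with a correct FCS gives a zero remainder when the CRC covers the FCS too."""
    return fcs16(frame) == 0
```

A receiver can therefore check a frame by running the same CRC over the whole frame, including the FCS field, and comparing the result with zero.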

Since O-QPSK modulation is being used in the Wime project, the next paragraph elaborates on it.

O-QPSK The O-QPSK PHY specifies the PHY Data Packet as shown in Figure 2.12.

• Synchronization Header (SHR) is used for indicating the start of the frame. It is composed of two subfields: Preamble and Start-of-Frame Delimiter (SFD). For the O-QPSK PHY, the preamble is four bytes long, with every byte set to zero. The SFD is one byte long and is formatted as "A7" in hexadecimal.

• PHR is the header for the physical layer packet and it specifies the frame length.


Figure 2.12: O-QPSK PHY Packet

• PHY Payload is the data payload, which for the purpose of this project is the MAC data frame. The PHY payload has a maximum size of 127 bytes.

The O-QPSK PHY uses 16-ary quasi-orthogonal modulation. The data packet is converted into data symbols. The symbols are subsequently converted to chips, which are modulated onto the carrier signal. The octets of the PHY data packet are converted into symbols starting from the preamble and ending with the last octet of the PHY payload. The four least significant bits of an octet are mapped to one symbol, and the four most significant bits are mapped to the next symbol.
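The packet structure and the octet-to-symbol mapping described above can be sketched as follows. This is an illustrative sketch only; the function and constant names are chosen here and are not taken from the Wime implementation.

```python
PREAMBLE = bytes(4)        # four zero octets
SFD = bytes([0xA7])        # start-of-frame delimiter
MAX_PHY_PAYLOAD = 127      # octets


def build_ppdu(mac_frame: bytes) -> bytes:
    """Prepend SHR (preamble + SFD) and PHR (frame length) to a MAC frame."""
    if len(mac_frame) > MAX_PHY_PAYLOAD:
        raise ValueError("PHY payload is limited to 127 octets")
    phr = bytes([len(mac_frame)])
    return PREAMBLE + SFD + phr + mac_frame


def octets_to_symbols(octets: bytes) -> list:
    """Map each octet to two 4-bit symbols, low nibble first."""
    symbols = []
    for octet in octets:
        symbols.append(octet & 0x0F)  # four least significant bits
        symbols.append(octet >> 4)    # four most significant bits
    return symbols
```

For example, the SFD octet 0xA7 yields the symbols 7 and 10, in that order.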

Each symbol is mapped to a 32-chip Pseudo Noise (PN) sequence. These sequences are related to each other through cyclic shifts and/or conjugation. The even-indexed chips of the sequence are modulated onto the in-phase component of the carrier, as shown in Figure 2.13, whereas the odd-indexed chips are modulated onto the quadrature component. Each chip is shaped as a half-sine pulse, given in equation 2.1.

$$
p(t) =
\begin{cases}
\sin\left(\pi \dfrac{t}{2T_c}\right), & 0 \le t \le 2T_c \\
0, & \text{otherwise}
\end{cases}
\tag{2.1}
$$

2Tc is the duration of the half-sine pulse in equation 2.1, where Tc is the chip period. The quadrature component is delayed by Tc with respect to the in-phase component, as shown in Figure 2.13.
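Equation 2.1 translates directly into code. The sketch below is only an illustration of the pulse shape; the function name is chosen here, and the chip period is passed in as a parameter rather than fixed.

```python
import math


def half_sine(t: float, tc: float) -> float:
    """Half-sine pulse of equation 2.1: non-zero only for 0 <= t <= 2*tc."""
    if 0.0 <= t <= 2.0 * tc:
        return math.sin(math.pi * t / (2.0 * tc))
    return 0.0
```

The pulse peaks at t = Tc, which is also the instant at which the delayed quadrature pulse starts.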

2.4 Wime Project

Wime Project [30] was created with the objective of providing experimentation as a tool for evaluating the performance of wireless communication systems. The project focuses on the evaluation of physical layer strategies using the Universal Software Radio Peripheral (USRP) SDR platform. It provides an IEEE 802.15.4 testbed which is interoperable with off-the-shelf TelosB nodes running the Contiki operating system, as shown by Bastil et al [3].

Figure 2.13: In-Phase and Quadrature Chip Sequences

Wime Project is built on the UCLA 802.15.4 PHY implementation [22] designed by Thomas Schmid. It ports the existing implementation to the modern GNU Radio software framework. Furthermore, it extends the PHY implementation with a MAC layer and adds the Rime stack as a network layer. The implementation details of the PHY layer and the MAC layer are described below.

• Modulation The Wime MAC works with asynchronous messages, whereas the PHY is implemented with stream interfaces. The GNU Radio modulation flow graph is shown in Figure 2.14. First, the asynchronous MAC messages are converted to a tagged stream by the PDU to tagged stream block.

Figure 2.14: GNU Radio Modulation flow graph

The individual octets of the stream are then converted to symbols as described previously in the O-QPSK paragraph. Instead of converting the symbols to chips directly and then segmenting them into in-phase and quadrature components, as in the UCLA PHY, the Wime Project implementation converts a symbol directly, through a symbol table, into a sequence of complex in-phase and quadrature values. Each input symbol results in 16 complex chips. These chips are interpolated by the repeat block, which repeats each chip four times. The repeated chips are then multiplied with a sine function at different phases, differing by π/4. This process makes a single chip four samples wide. The quadrature component is delayed by two samples to satisfy the offset requirement (Tc).

Figure 2.15: GNU Radio Demodulation flow graph

• Demodulation The data stream from the USRP is passed through a quadrature demodulation block as shown in Figure 2.15. It performs FM demodulation of the received signal stream. The clock recovery block takes this demodulated signal stream and first performs decimation, followed by a Mueller and Müller discrete-time error-tracking synchronizer. The clock recovery block outputs the chip sequences obtained from the incoming signal stream. These chip sequences are then sliced and converted to symbols by the packet sink block. The decoded valid message is then passed to the Wime MAC.

• MAC The Wime project adds a very simple MAC which encapsulates upper layer packets with the MAC header and adds the FCS. When receiving messages from the PHY, it performs a validity check on the received message. On successful validation, it strips away the MAC header. The MAC layer transmits messages immediately, without any carrier sensing.
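The pulse shaping and FM demodulation steps described in the list above can be sketched in plain Python. This is an illustrative sketch, not the Wime or GNU Radio code: the chip values used with it are hypothetical, the symbol table is not reproduced, and `shape_chips` and `quadrature_demod` are names chosen here. It assumes four samples per complex chip and a quadrature delay of two samples.

```python
import cmath
import math

SAMPLES_PER_CHIP = 4
Q_DELAY = 2  # samples, corresponding to Tc
# Sine values at phases 0, pi/4, pi/2, 3*pi/4 (one half-sine pulse).
PULSE = [math.sin(k * math.pi / 4) for k in range(SAMPLES_PER_CHIP)]


def shape_chips(chips):
    """Repeat each complex chip four times, shape it, and delay Q."""
    i_samples, q_samples = [], []
    for chip in chips:
        chip = complex(chip)
        for k in range(SAMPLES_PER_CHIP):
            i_samples.append(chip.real * PULSE[k])
            q_samples.append(chip.imag * PULSE[k])
    # Delay the quadrature branch and pad the in-phase branch to match.
    q_delayed = [0.0] * Q_DELAY + q_samples
    i_padded = i_samples + [0.0] * Q_DELAY
    return [complex(i, q) for i, q in zip(i_padded, q_delayed)]


def quadrature_demod(samples, gain=1.0):
    """FM demodulation: phase difference between consecutive samples."""
    return [
        gain * cmath.phase(cur * prev.conjugate())
        for prev, cur in zip(samples, samples[1:])
    ]
```

A constant-frequency complex tone passed through `quadrature_demod` yields a constant output equal to its phase increment per sample, which is the property the receiver relies on before clock recovery and slicing.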

2.5 Tools Used

In this section, the report provides a very short introduction to the tools used during the project.


2.5.1 USBMon

USBMon is a kernel facility for collecting I/O traces on the USB bus [28]. It reports the requests made to and by the USB Host Controller Drivers (HCD). The overall software architecture is shown in Figure 2.16. It provides two kinds of APIs: binary and character. The binary API is accessed through character devices located in the Linux devices namespace. The character API provides human readability and a uniform format for the traces. The USBMon text data is made available to userspace from the kernel through the debugfs [7] utility.
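As an illustration of the human-readable traces, the sketch below parses the leading fields of one line of USBMon text output: URB tag, timestamp in microseconds, event type and address word. The sample line used with it is hypothetical, and the class and function names are chosen here; only the first four whitespace-separated fields are interpreted.

```python
from typing import NamedTuple


class UsbmonEvent(NamedTuple):
    urb_tag: str
    timestamp_us: int
    event_type: str     # 'S' submission, 'C' callback, 'E' submission error
    transfer_type: str  # 'B' bulk, 'C' control, 'I' interrupt, 'Z' isochronous
    direction: str      # 'i' input, 'o' output
    bus: int
    device: int
    endpoint: int


def parse_usbmon_line(line: str) -> UsbmonEvent:
    """Parse the first four fields of a USBMon text trace line."""
    tag, timestamp, event_type, address = line.split()[:4]
    type_dir, bus, device, endpoint = address.split(":")
    return UsbmonEvent(
        tag, int(timestamp), event_type,
        type_dir[0], type_dir[1],
        int(bus), int(device), int(endpoint),
    )


# Hypothetical trace line, for illustration only.
SAMPLE = "ffff8800d1a2b3c0 1843275921 S Bo:2:004:1 -115 4096 = 00000000"
```

Filtering for bulk output submissions on one device then reduces to comparing the `event_type`, `transfer_type`, `direction` and `device` fields.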

Figure 2.16: USBMon Architecture (Adapted from [2]).

2.5.2 Pidstat

In order to quantitatively evaluate the impact of processing resources, we use pidstat. Pidstat is a Linux tool provided by the sysstat package. It can be used to monitor the resource usage of a Linux process. It can be configured to collect information on CPU utilization, memory utilization, context switches, and the time used by the process at application level and kernel level. It collects the information at specified time intervals, and the duration of one collection can be specified when pidstat is invoked.


Chapter 3

Literature Study

The previous chapter gave a theoretical understanding of SDR systems, GNU Radio, LimeSDR and IEEE 802.15.4 to provide the background necessary for understanding the report. Building on that, this chapter introduces previous relevant research work. As there is no prior work on the LimeSDR-USB platform, all the research work investigated uses USRP as the SDR platform. The previous work is analyzed to identify the methods used for measuring the timing delays. The results of each of these works are analyzed to find the performance bottlenecks. Finally, previous work is explored for mitigation strategies.

Schmid et al [23] focused on characterizing the latency and its impact on throughput for modern protocols like IEEE 802.15.4 in a Host-PHY architecture. The work introduced the problem of blind spots and the impact of bus transfer latency. The measurement setup used an external oscilloscope with one channel connected to the parallel port of the host computer and the other channel connected to one of the RF ports of the USRP. The work concentrated on analytically modeling the bus latency and assumed that the rest of the delay is introduced by the software processing. A key takeaway from the results is that the USB transfer time depends on the USRP buffer size specified at system initialization. The bus latency is significant at lower sampling rates, as it takes more time to fill up the USRP buffers. At higher sampling rates, however, the bus latency is negligible compared to the processing delay.

Nychis et al [20] argued for the need of a split-MAC approach, where time-sensitive MAC operations are moved to the FPGA and controlled by the host computer. Since this project concentrates on timing delay characterization, the discussion is limited to the relevant delay analysis.

The Split-MAC approach was motivated by precise timing information on delays at two levels: between kernel and userspace, and between kernel and FPGA. Timestamps were introduced at different points of the TX and RX chains to quantitatively measure these delays. For the kernel to FPGA time, the work modified the kernel's USB driver and measured the time at the last point before the DMA write request, or after the DMA read request interrupts the driver. The USRP ping command was used for the measurement of the overall round trip time. It is important to note that this measurement setup did not use the radio front-end of the USRP, so the reported bus transfer time is not controlled by the sampling rate, unlike in the work of Schmid et al [23]. Even in that case, the bus transfer time was quite significant. The work further reduced the USB transfer size from the default 4096 bytes to 512 bytes. It concluded that the USB transfer setup time is quite significant, as this modification reduced the kernel to FPGA time only by a factor of two. Although the kernel to FPGA time contributed significantly to the overall latency, it contributed a limited amount of jitter in their results. On the other hand, GNU Radio processing had a high standard deviation.

Truong et al [26] investigated and analyzed the different sources of delay in a USRP Embedded E100 SDR platform. The USRP E series has an embedded processor which allows it to operate in standalone mode. Instead of communicating with a host computer through a communication bus, the E series uses a General Purpose Memory Controller (GPMC) to connect the embedded memory and the FPGA buffers. The measured latency was segmented into three parts: software, bus, and hardware delays. Software delay is defined as the delay introduced by the software buffering scheme in GNU Radio and the other host computer processes involved. Bus delay is the delay introduced by the buffering in the USRP Hardware Driver (UHD). The hardware delay is mainly caused by buffering of data in the FPGA First In First Out (FIFO) buffers, which is proportional to the USRP sample rate. Similar to Nychis et al [20], this work measured overall round-trip times as well as individual component delays using timestamps at different steps. The ping command was used to evaluate the overall latency, and a GNU Radio flow graph was used for the timestamp method. One thing to note is that the results are computed over 873 out of the 5000 ping messages sent. Finally, the work showed the impact of the UHD buffer size on latency, with a lower buffer size leading to a lower mean and standard deviation in the measured latency. The work concluded that host computer processing time is the main latency bottleneck, as an embedded processor is used.


Puschmann et al [21] developed a Send-and-Wait Automatic Repeat reQuest (ARQ) protocol testbed using the USRP2 as the SDR platform. They evaluated the testbed by measuring end-to-end throughput and latency. In this work, the ping command was used for measuring the round trip times at the data link layer and the application layer. They investigated the impact of the sample buffer size in the USRP2 driver on the round trip times. Lower sample buffer sizes led to lower round-trip times with less jitter, as the received samples did not need to wait unnecessarily in the queues.

As highlighted by all these previous works, there is significant latency in SDR platform based communication systems. Previous works have tried to showcase delays in different segments of the processing chain, but a comprehensive evaluation is missing. The literature review showed that there is a lack of knowledge about the latency of recent state-of-the-art platforms like LimeSDR. One of the purposes of this project is to fill this knowledge gap. Finally, most of the works tried to mitigate the buffering delays in the UHD by modifying the UHD sample buffer size. Since LimeSDR implements a different software architecture, alternative mitigation techniques need to be developed.


Chapter 4

Timing Characterization of the LimeSDR-USB platform

This chapter introduces the methodology and system architecture, followed by the experimental designs for the quantitative analysis. The system architecture covers the GNU Radio flow graph, software and hardware used in this project. In the first experiment, the performance of the system is measured with respect to broader parameters like sampling rate and data payload size. The second experiment is designed to evaluate the impact of these parameters on the different subsections of the data path. In the third experiment, we look at the impact of the USB transfer size on the overall delay. In Chapter 5, we refer to the results from these experiments for the selection of our mitigation methods.

4.1 Methodology and Literature Reflection

This project uses a quantitative experimental design method. The characterization of timing delays requires quantitative measurements, as they help compare the delays to the specified standards and define the limitations concretely. This project concentrates on designing experiments for understanding the causal effect of system parameters on the measurements.

As this project aims for timing characterization of signal processing chains, a method was needed for measuring the time duration of the individual processes. The LimeSDR does not have a real time clock on board, so measuring time on the LimeSDR would require synchronization of different clock domains across a variable latency link. This process is complex and not feasible given the time constraint. Therefore, this project avoids doing any time measurement on the LimeSDR. Considering the methods used for delay characterization in the literature review, this project opts to timestamp the SDR software processes on the host computer. This method is reliable and does not affect the measurements significantly.

Most of the previous studies used ping as the measurement tool. In that case, there are two host computers connected through two SDR platforms. As the host computers will not be identical, they will impact the collected measurements differently. It would also not be possible to evaluate the impact of the host computer processing resources. Another problem with this method is the use of common frequency bands: communication among other devices can make the setup unreliable. As this project concentrates on the performance of the LimeSDR platform without considering the radio channel conditions, the impact of communication through the air medium can be safely ignored. Finally, measurement across two different platforms makes it difficult to correlate the measurements on the two computers, as the clocks are not synchronized. Considering these factors, this project picks the loopback method. In this method, the TX and RX chains on the LimeSDR are shorted before the LNA of the LMS7002M chip shown in Figure A.2. This method uses the radio front-end of the LimeSDR, which helps evaluate the impact of the radio front-end configuration on the performance.

4.2 Experimental Setup and Implementation

This section introduces the details of our experimental setup. The implementation aspects of the experimental setup are discussed after the description of the setup.

4.2.1 Experimental Setup

The physical experimental setup is shown in Figure 4.1 to provide a broader picture of the setup. The host computer runs the GNU Radio communication system, which interacts with the LimeSDR platform through a USB 3.0 connection. The LimeSDR is configured to run in the loopback configuration.

Figure 4.1: Measurement Setup

The software flow diagram for the experimental setup is shown in Figure 4.2. This diagram, in conjunction with the data flow diagram (Figure 4.3), gives the complete picture of the experimental setup. In Figure 4.2, GNU Radio runs the communication system, which is primarily based on the Wime project implementation of the 802.15.4 protocol. A message source block, described in Section 4.2.2, is added to generate a controlled excitation signal for evaluating the experimental setup. It generates the payload for the Wime 802.15.4 MAC, which adds the MAC header and the CRC to form the MAC packet. The Wime 802.15.4 PHY adds the preamble and the length fields to form the physical layer packet, which is then modulated using O-QPSK. GNU Radio processes these samples in chunks, with the chunk size defined by the block's scheduler. The samples generated by this GNU Radio process are passed to the LimeSDR driver, which carries out the process described in Section 2.2.2. It generates the FPGA packets in batches, with the batch size defined at initialization. The formatted packets are forwarded to the Linux USB library, libusb, which generates the USB transaction. Finally, the host controller transfers the data to the LimeSDR. Each USB transaction has a token defining the direction of the transfer, followed by bursts of USB data packets. The size of the USB data packets is defined by the USB descriptor of the LimeSDR Cypress EZ-USB FX3 during device registration. The LimeSDR platform is configured to work in loopback mode, with the TX and RX chains on the LimeSDR shorted. Once the data moves to the RX path of the LimeSDR FPGA, it is transferred through the Cypress EZ-USB FX3 to the host computer. It moves from the USB host controller to the GNU Radio RX flow graph in the reverse order of the TX data flow described before.

Figure 4.2: Experimental Setup Flow Diagram

Hardware and Software Used The project uses two host computers in order to evaluate the impact of the host computer processing resources on the characteristics of the time delays. We refer to the two host computers as 'laptop' and 'desktop computer'; their hardware specifications are shown in Table 4.1. The desktop computer has better processing resources because of its higher CPU clock speed and two additional CPU cores. It uses Ubuntu 16.04.5 LTS as its operating system, whereas the laptop uses Elementary OS, built on Ubuntu 16.04.5 LTS.


Figure 4.3: Dataflow Diagram

| Resource | Laptop | Desktop Computer |
|---|---|---|
| CPU | Intel® Core™ i5-4300U | Intel® Core™ i5-3470 |
| CPU Clock Speed | 1.9 GHz | 3.2 GHz |
| CPU Cores | 2 | 4 |
| CPU Threads | 4 | 4 |
| RAM | 7.9 GB | 15.6 GB |

Table 4.1: Hardware Specifications

To ensure that the experimental setups on the two host computers are as similar as possible, we configure them with the same software configuration, which is shown in Table 4.2.

| Component | Version |
|---|---|
| Linux Kernel | Linux 4.15.0-24-generic |
| GNU Radio version | 3.7.12.0 |
| LimeSuite version | 18.06.1-g58ab1c3c |
| LimeSDR GW version | 2.17 |
| sysstat version | 11.2.0-1ubuntu0.2 |

Table 4.2: Software Specifications

4.2.2 Implementation

The Wime project implementation needed to be modified to use LimeSDR instead of USRP as the SDR platform. Furthermore, the entire experimental setup needed some alterations to facilitate the experimental process. These adaptations can be grouped as:

1. GNU Radio Source and Sink blocks

2. 802.15.4 PHY layer

3. Periodic Message Source

GNU Radio Source and Sink Blocks The USRP source and sink blocks are replaced by the gr-limesdr source and sink blocks in order to use the LimeSDR platform. An alternative is the gr-osmosdr sink and source blocks, which use the soapysdr libraries to access the Lime API. The former was chosen as it interacts directly with the Lime API without going through the adaptation layer presented by the soapysdr project. This gives much better control over the board control parameters and also saves the multiple memcpy operations used by the soapysdr adaptation layer.

In both these blocks, the loopback configuration is defined at initialization by setting the loopback register on the NIOS processor. This turns on the switch at the end of the LNA, which creates a data flow from the TX path to the RX path on the LMS7002M.

The LimeSDR driver sends USB packets in bursts whose size is determined by the FPGA data packet size and the batch size of FPGA data packets defined for the stream. If the GNU Radio packet data size is less than the size of a burst, the LimeSDR driver waits for more data from GNU Radio. After a set timeout, it pads zeros at the end of the original data and sends the burst.

Figure 4.4: Radio Silent Problem

This implementation works well for continuous stream based protocols, but for packet based protocols like IEEE 802.15.4 it leads to a radio silence problem, as illustrated in Figure 4.4. The figure shows an example LimeSDR transmission over the air. If the data contained in one packet does not satisfy the size requirement imposed by the LimeSDR driver, the radio becomes silent (shown as the gap in the figure), and the transmission breaks down. The receiver assumes that the radio signal for a packet is a single continuous stream, so if the radio signal is interrupted, it cannot receive the packet properly.

To solve this problem, this project adds a GNU Radio stream tag called "End of Packet" to the end of the sample stream from the 802.15.4 PHY block. In the gr-limesdr block, the tag instructs the LimeSDR driver to immediately send the data to the LimeSDR by appending zeros at the end of the packet data.
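The effect of the "End of Packet" tag can be illustrated with a small sketch of the padding step. This is not the gr-limesdr code: `pad_to_burst` and the `burst_len` parameter are hypothetical names, and the sketch only shows the idea of filling the last burst with zeros so that it can be sent without waiting for more samples.

```python
def pad_to_burst(samples, burst_len):
    """Append zero samples so the length is a multiple of burst_len."""
    if burst_len <= 0:
        raise ValueError("burst_len must be positive")
    remainder = len(samples) % burst_len
    if remainder == 0:
        return list(samples)
    return list(samples) + [0j] * (burst_len - remainder)
```

The trade-off is that the zero samples are transmitted as well, so a larger burst size means more idle air time after short packets.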

802.15.4 PHY layer The Wime project PHY layer was designed to work only with a 4 MHz sampling rate. In this project, the PHY layer has been modified to accommodate different sampling rates, as this helps in understanding the impact of sampling rate on the performance.

On the TX flow graph of the GNU Radio 802.15.4 PHY, a sine wave generator is added which takes the desired frequency as input. This desired frequency is referred to as setfrequency. The repeat block in the GNU Radio TX flow graph interpolates a single chip by repeating it setfrequency/1 MHz times, as the modulator generates a signal of bandwidth 1 MHz. On the RX flow graph, the clock recovery module decimates by setfrequency/2 MHz, as the received signal should have a bandwidth of 2 MHz.
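The two factors follow directly from the chosen sampling rate. The sketch below only restates this arithmetic; the function name is chosen here, and the restriction to multiples of 2 MHz is an assumption made so that both factors are integers.

```python
def rate_factors(set_frequency_hz: int):
    """Return (TX repeat factor, RX decimation factor) for a sampling rate."""
    if set_frequency_hz <= 0 or set_frequency_hz % 2_000_000 != 0:
        raise ValueError("sampling rate assumed to be a multiple of 2 MHz")
    repeat = set_frequency_hz // 1_000_000      # setfrequency / 1 MHz
    decimation = set_frequency_hz // 2_000_000  # setfrequency / 2 MHz
    return repeat, decimation
```

At the original 4 MHz sampling rate this gives a repeat factor of 4, matching the four samples per chip of the Wime modulator, and a decimation factor of 2.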

Periodic Message Source This project needed a controllable excitation source for evaluating the experimental setup. A periodic message source block was implemented in GNU Radio; it takes the message length and time period as parameters. Figure 4.5 shows the working of the message source with respect to time. The time period controls the time between the generation of two subsequent messages. The data length controls the data payload size of the 802.15.4 MAC layer packet. It indirectly controls the transmission time, ∆Ttx, of the message: the larger the message payload, the more time it takes to transmit through the setup.
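The dependence of ∆Ttx on the payload size can be estimated with a short calculation. This sketch assumes the nominal 250 kbit/s data rate of the 2.4 GHz O-QPSK PHY (32 µs per octet), which is not stated in this text, and counts the six octets of SHR and PHR from the PHY packet structure. The function name is chosen here, and the MAC frame length passed in must already include the MAC header and FCS.

```python
US_PER_OCTET = 32          # 8 bits at an assumed 250 kbit/s
SHR_PHR_OCTETS = 4 + 1 + 1  # preamble + SFD + PHR


def airtime_us(mac_frame_len: int) -> int:
    """Nominal on-air duration in microseconds of one PHY packet."""
    if not 0 <= mac_frame_len <= 127:
        raise ValueError("MAC frame must fit in the 127-octet PHY payload")
    return (SHR_PHR_OCTETS + mac_frame_len) * US_PER_OCTET
```

Under these assumptions the longest possible packet occupies the air for 4256 µs, which gives a lower bound for the time period of the message source if packets are not to overlap.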

Figure 4.5: Periodic Message Source

4.3 Metrics

Before going into the details of the experiments, we need to specify our metrics, their definitions and the measurement methods used in this project. Previous work highlighted in the literature review used round-trip times and broader component delays to understand the delays in their experimental setups. Considering this, we decided to measure the round-trip time and the component delays, but our definitions and measurement techniques differ from those of previous work. Firstly, previous work used a setup with two SDRs, where the ping command was used to measure the round trip time. As we are interested in measuring the delays in isolation, we decided to measure the time taken from the transmission of a message to the reception of that message. For this, we specify the first metric as latency. It lets us measure the time taken by the entire data flow path. Secondly, previous work used coarse-grained measurements of the delays contributed by different components. In this project, we focus on more granular component delay measurement. This helps us to identify the bottlenecks in this system architecture and to comprehensively fill the knowledge gap discovered by the literature review. Hence, we specify our second metric to be component delays. In the following subsections, we describe these two metrics and the measurement methods.

4.3.1 Latency

In general, latency is defined as the time it takes a packet to reach the receiver from the transmitter. In this project, the transmission time is defined as the time the packet enters the 802.15.4 MAC. The reception time is defined as the time the SFD is detected in the received packet by the 802.15.4 PHY packet detector.

This latency definition is asymmetric, as it ignores the processing time of received packets in the MAC layer. All the reported values are therefore lower than those for MAC layer latency. This is done to eliminate the impact of data payload size on the latency measurements, as the study concentrates mainly on the time delay analysis.

The periodic message source notes down the global system time as T1 when it publishes a message to its output port. This value is written to a file when the message source receives the payload from the MAC, which ensures that the time is recorded only for data packets successfully decoded by the MAC. The packet detector in the RX flow graph detects the SFD and notes down the time as T8. When the entire packet is successfully decoded by the packet detector, it writes this time to another file. Waiting for the entire packet to be decoded makes sure that the program writes the time only for a complete physical layer packet, and not whenever it detects the preamble sequence.

Since the time noted should be compared with those from USBMon, clock_gettime was selected as the preferred method. clock_gettime is a function provided by the Linux time library to measure the system time in nanoseconds. We truncate the value to microseconds, as USBMon returns USB packet timestamps in microseconds. The difference (T8 − T1) gives the latency as per this project's definition. As two different processes are measuring the timestamps, the correct T1 and T8 need to be grouped together for the latency measurement. The following algorithm is used for this purpose.

Algorithm 1 Time Data Correlation
    T ← time period of message source
    l ← min(length of time arrays)
    i ← 0
    while i < l do
        if t1[i] > t8[i] then
            delete t8[i]
        else if (t8[i] − t1[i]) > T then
            delete t1[i]
        else
            i ← i + 1
        end if
        l ← min(length of time arrays)
    end while

The algorithm assumes that the time period of the message source is greater than the latency. This is enforced through the design of our experiment by configuring the message source with a significantly higher time period than the latency.
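The correlation step can be sketched in Python as follows. This is one interpretation of the intent of Algorithm 1 rather than the project's own program: a reception timestamp that precedes the current transmission timestamp is treated as having no logged transmission and is dropped, and a transmission timestamp whose candidate reception lies more than one period later is treated as a lost packet and is dropped. The function name is chosen here.

```python
def correlate(t1, t8, period_us):
    """Pair transmission (t1) and reception (t8) timestamps.

    Both inputs are assumed to be sorted in increasing order and
    expressed in microseconds. Returns a list of (t1, t8) pairs.
    """
    t1, t8 = list(t1), list(t8)
    i = 0
    while i < min(len(t1), len(t8)):
        if t1[i] > t8[i]:
            del t8[i]   # reception without a matching transmission
        elif t8[i] - t1[i] > period_us:
            del t1[i]   # transmission without a matching reception
        else:
            i += 1
    n = min(len(t1), len(t8))
    return list(zip(t1[:n], t8[:n]))
```

The latency samples are then simply the differences within each returned pair.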

4.3.2 Component Delays

Granular timestamps are necessary for quantitatively analyzing the results. This project defines seven different time delays for this purpose.

1. GNU Radio TX Processing Delay This is the delay incurred in generating the sample stream of the modulated packet.

2. LimeSDR TX Driver Delay The time it takes the driver to pack the data into batches of FPGA data packets can be an important metric to show whether the driver needs a closer look for optimization.

3. User Space to Kernel Space Delay This delay gives an idea of how the Linux process scheduler impacts the performance.

4. LimeSDR loopback Delay The time taken by the bus transfer, the buffering in the LimeSDR FIFOs and the hardware delays provides an insight into how the transfer of data from the host computer to the LimeSDR and vice versa impacts the performance. In this case, the project assumes that the hardware processing delays are negligible and that most of the time is contributed by the bus transfers and buffering delays.

5. Kernel Space to User Space Delay This delay focuses on the impact of Linux scheduling.

6. LimeSDR RX Driver Delay This is the time needed to unpack the FPGA data packets and pass them to the GNU Radio flow graph. It also includes the buffering time in case the GNU Radio RX flow graph is unable to process the samples in real-time. By real-time, this project means that the data are processed at the rate at which they are generated.

7. GNU Radio RX Processing Delay This is the delay introduced by the RX flow graph in GNU Radio.

4.3.2.1 Component Time Delays Measurement

With the need for fine-grained delay measurement highlighted, this report concentrates on providing the implementation details of the measurement setup.

The measurement setup for this experiment is shown in Figure 4.6. This setup is more or less similar to our initial setup, with the timestamp data points enumerated as per the delays mentioned previously. The project concentrated on finding the last execution statement applied to the data in each component shown in Figure 4.6, using static code analysis. This method is only valid for the userspace components, as their source code is easily accessible. The method for measuring the timestamps from the USB Host Controller in kernel space is more complex and is covered in Section 4.3.2.2.

Figure 4.6: Component Time Delay Measurement Setup

The timestamps, their corresponding execution calls, and how they relate to the delays defined previously are shown in Figure 4.7. Starting with the TX path, the time at which the message enters the 802.15.4 MAC layer is T1. The gr-limesdr block hands over the data to the LimeSDR driver using the method LMS_SendStream, so a timestamp T2 is taken just before the LMS_SendStream call. The difference (T2 − T1) gives the GNU Radio TX Processing Delay. The LimeSDR driver uses the ConnectionFX3 module for interacting with the LimeSDR. ConnectionFX3 sends the data to the LimeSDR using the method libusb_submit_transfer of the libusb library. The time instant of calling this method is noted down as T3. The difference (T3 − T2) gives the time taken by the LimeSDR driver on the TX path. When the data is sent to the USB Host Controller using URB_submit from the kernel USB driver, we measure the time of the first transfer of an 802.15.4 TX data packet as T4. The difference (T4 − T3), the time it takes for the data to be sent to the USB host controller, is referred to as the User Space to Kernel Space Delay.

Coming to the RX data path, the time of the first transfer of 802.15.4 RX packet data from the USB Host Controller to the kernel USB driver is noted as T5. The LimeSDR loopback delay can be measured as (T5 − T4). When the ConnectionFX3 module in the LimeSDR driver receives a USB transfer, the timestamp T6 is taken. The kernel space to user space delay is taken as the difference (T6 − T5). T7 is measured when the gr-limesdr block receives the data using the LMS_ReceiveStream method. The difference (T7 − T6) gives us the delay introduced by the LimeSDR driver on the RX data path. Finally, we have T8 from our previous experimentation, which provides the difference (T8 − T7) as the GNU Radio RX processing time.

Figure 4.7: Overview of the timestamps.
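The timestamp differences defined above can be collected per packet with a few lines of code. The following is a minimal sketch, not the code used in this project: the dictionary layout, the function name and the example timestamp values are assumptions made for illustration only.

```python
from typing import Dict

# (delay name, start timestamp index, end timestamp index), following the text.
# The table layout itself is an assumption made for this sketch.
DELAY_DEFINITIONS = [
    ("GNU Radio TX processing", 1, 2),
    ("LimeSDR driver TX", 2, 3),
    ("User space to kernel space", 3, 4),
    ("LimeSDR loopback", 4, 5),
    ("Kernel space to user space", 5, 6),
    ("LimeSDR driver RX", 6, 7),
    ("GNU Radio RX processing", 7, 8),
]


def component_delays(t: Dict[int, float]) -> Dict[str, float]:
    """Return every component delay T(end) - T(start) for one packet."""
    return {name: t[end] - t[start] for name, start, end in DELAY_DEFINITIONS}


# Made-up timestamps in microseconds, for illustration only.
example = {1: 0.0, 2: 120.0, 3: 150.0, 4: 160.0,
           5: 460.0, 6: 480.0, 7: 520.0, 8: 900.0}
delays = component_delays(example)
total_latency = example[8] - example[1]

for name, value in delays.items():
    print(f"{name}: {value} us")
```

Because the seven intervals are contiguous, their sum equals the end-to-end latency T8 − T1, which gives a simple consistency check on a set of correlated timestamps.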

4.3.2.2 Measuring T4 and T5

In order to ensure that the LimeSDR loopback delay reflects as accurately as possible the delay contributed by the buffering and hardware delays, this project picks the USB host controller as the source for measurement of T4 and T5. This choice makes the measurement more difficult. Firstly, modifying the USB host controller code is complicated. Secondly, T4 and T5 should be measured only for 802.15.4 packets, and not for all TX and RX data, as these values provide the bounds for our data correlation.

The project uses the USBMon kernel utility to monitor the urb_submit and urb_request calls to and from the USB host controller. Each URB has an associated timestamp, which is generated by the kernel USB driver and the USB host controller for submit and request respectively. The project uses the binary interface of the USBMon utility, as the text interface provides only 64 bytes of URB data, which is not suitable for our use case. Once the events have been received from the USBMon event queue, it is necessary to find the relevant USB transfers in these events. This project adopts an offline processing approach, where a separate program, the "Timing Measurement Program" (shown in Figure 4.6), collects all the events, filters out the unnecessary event data, and writes the remainder to files. This approach is chosen because processing the data at runtime adds processing overhead, which would impact the component delays. Furthermore, this approach provides us with the USB transfer data, which can be used for further processing if necessary.

Timing Measurement Program: The "Timing Measurement Program" can be explained using the flowchart shown in Figure 4.8. This program uses ioctl to access the /dev/usbmonX character device, which gives it access to the USBMon kernel utility event queue. The events are filtered to find transfers with the data endpoints 0x01 and 0x81 (shown in Table 2.2), which are written to the TX data file and the RX data file respectively. The data is buffered in memory before writing to a file, as this results in less frequent calls to the file write operation. A data structure is defined to help the analysis program parse the written data files, which is shown in Figure 4.8. The structure includes an eight-byte counter, which holds the sequence number of the structure and is used primarily for debugging. The URB timestamp is added to the structure. It is subdivided into eight bytes for the seconds value of the timer and four bytes for the fractional seconds, which provide the timer value in microseconds. The timing measurement program strips away the LimeSDR FPGA data packet header and combines all the data of one USB transfer in the data field of this structure. The default FPGA packet contains 4080 bytes of data, so the amount of data written will be batchsize × 4080, which is shown in Figure 4.8.
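The record layout described above can be illustrated with a short parsing sketch. The field order, little-endian byte order, absence of padding, and the function names are assumptions of this sketch; only the field sizes (8-byte counter, 8-byte seconds, 4-byte microseconds, batchsize × 4080 data bytes) come from the text.

```python
import struct

PAYLOAD_BYTES_PER_PACKET = 4080   # data bytes per LimeSDR FPGA packet (from the text)
# Assumed layout: counter (8 bytes), seconds (8 bytes), microseconds (4 bytes),
# little-endian and without padding.
HEADER = struct.Struct("<QqI")


def pack_record(counter, ts_sec, ts_usec, data):
    """Serialise one record: header fields followed by the raw sample bytes."""
    return HEADER.pack(counter, ts_sec, ts_usec) + data


def unpack_records(blob, batch_size=1):
    """Split a file image into (counter, timestamp in microseconds, data) tuples."""
    data_len = batch_size * PAYLOAD_BYTES_PER_PACKET
    record_len = HEADER.size + data_len
    records = []
    for offset in range(0, len(blob) - record_len + 1, record_len):
        counter, ts_sec, ts_usec = HEADER.unpack_from(blob, offset)
        start = offset + HEADER.size
        timestamp_us = ts_sec * 1_000_000 + ts_usec
        records.append((counter, timestamp_us, blob[start:start + data_len]))
    return records


# Two dummy records with zeroed sample data, for illustration only.
blob = (pack_record(0, 10, 250, bytes(PAYLOAD_BYTES_PER_PACKET))
        + pack_record(1, 10, 900, bytes(PAYLOAD_BYTES_PER_PACKET)))
records = unpack_records(blob)
```

Combining the seconds and microseconds fields into a single integer microsecond value keeps later timestamp comparisons exact.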


Figure 4.8: Timing Measurement Program flow chart

Figure 4.9: USB Data Analysis Program Structure

USB Data Analysis: The data structures written to the TX data file and the RX data file need to be analyzed to find the timestamps T4 and T5 respectively. Since this project uses analog loopback, the TX sample values change as they are converted to analog and then sampled again into digital RX samples. For this reason, the sample values in the TX and RX USB packets cannot be compared directly to find when the same sample is returned. Hence, this project uses cross-correlation to match the TX and RX USB transfers and find the time shift of the RX samples from the TX samples.

The structure of the program written for this purpose is shown in Figure 4.9. The program starts by reading the TX data file and parsing the data structure to find the data samples. It writes the data samples to the TX data array when the amplitude of some samples is greater than 0.5. This threshold for detecting relevant data was chosen after experimental observation. The URB timestamp in the structure is written to the TX timestamp array. The program then writes the subsequent structures to the data array and timestamp array until the URB timestamp exceeds the first timestamp by 500 µs. Again, this value was chosen by experimental observation. The first URB timestamp in the TX timestamp array is chosen as T4, as it is the timestamp of the first relevant TX USB transfer.
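The TX selection step described above can be sketched as follows. This is an illustrative sketch only: the record representation (a timestamp in microseconds plus a list of complex samples assumed to be normalised to unit full scale) and the function name are assumptions, while the 0.5 amplitude threshold and the 500 µs window come from the text.

```python
def extract_tx_burst(records, threshold=0.5, window_us=500):
    """Collect the first relevant TX transfer and the transfers that follow it.

    records: iterable of (timestamp_us, samples) in file order, where samples
    is a list of complex values (assumed normalised to unit full scale).
    Returns (T4, TX data array, TX timestamp array).
    """
    data, stamps = [], []
    first_ts = None
    for ts, samples in records:
        if first_ts is None:
            # Skip transfers until one contains a sample above the threshold.
            if not any(abs(s) > threshold for s in samples):
                continue
            first_ts = ts
        elif ts - first_ts > window_us:
            # Stop once the URB timestamp exceeds the first one by the window.
            break
        data.extend(samples)
        stamps.append(ts)
    t4 = stamps[0] if stamps else None
    return t4, data, stamps


# Made-up records, for illustration only.
demo_records = [
    (100, [0j, 0.1 + 0j]),
    (350, [0.9 + 0j, 0j]),
    (600, [0.2 + 0j, 0j]),
    (800, [0j, 0j]),
    (900, [0j, 0j]),
]
t4, tx_data, tx_stamps = extract_tx_burst(demo_records)
```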

The RX data file is parsed for a USB transfer which was received just after T4. The data of the structure containing this transfer and of the subsequent transfers is written to the RX data array, and the URB timestamps are written to the RX timestamp array. Once the data has been arranged in these arrays, it is converted to complex data samples using the data conversion method of the LimeSDR RX driver. The complex data samples are then cross-correlated with the TX data array, and the sample index of the maximum cross-correlation value is found. The cross-correlation reaches its maximum when the first TX sample is perfectly aligned with the first RX sample. The sample index of the maximum cross-correlation therefore gives the sample index of the first relevant RX sample, since the TX samples do not have any initial shift because of the way the TX data array was built. This process is expressed mathematically using Equation 4.1 and Equation 4.2.

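Now, the URB timestamp for that sample is found by the following equations:

$$\mathrm{RXTime}_{\mathrm{index}} = \frac{\arg\max\left(\mathrm{TX}_{\mathrm{data}} \star \mathrm{RX}_{\mathrm{data}}\right)}{N \times \mathrm{Batchsize}} \tag{4.1}$$

$$T_5 = \mathrm{RXTimeArray}\left[\lfloor \mathrm{RXTime}_{\mathrm{index}} \rfloor\right] \tag{4.2}$$

where ⋆ is the cross-correlation operator, N is the number of samples in one LimeSDR FPGA packet (default value is 1020 samples) and ⌊·⌋ is the mathematical floor function.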


Equation 4.1 provides the index for the RX timestamp array, which is used to generate T5, the URB timestamp for the first relevant RX USB transfer.
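Equations 4.1 and 4.2 can be illustrated with a small pure-Python sketch. It is not the analysis program of this project: the function names, the restriction to non-negative lags, the use of the correlation magnitude, and the toy signal are assumptions chosen to keep the example short.

```python
import math


def cross_correlation_lag(tx, rx):
    """Return the lag d >= 0 that maximises |sum_n conj(tx[n]) * rx[n + d]|."""
    best_lag, best_val = 0, -1.0
    for lag in range(len(rx) - len(tx) + 1):
        acc = 0j
        for n, t in enumerate(tx):
            acc += t.conjugate() * rx[lag + n]
        if abs(acc) > best_val:
            best_lag, best_val = lag, abs(acc)
    return best_lag


def find_t5(tx, rx, rx_time_array, samples_per_packet=1020, batch_size=1):
    """Map the best lag to the URB timestamp of the transfer that contains it."""
    lag = cross_correlation_lag(tx, rx)
    rx_time_index = lag / (samples_per_packet * batch_size)  # Equation 4.1
    return rx_time_array[math.floor(rx_time_index)]          # Equation 4.2


# Toy data: the RX stream is the TX burst delayed by 9 samples, attenuated and
# phase-rotated, as an analog loopback might do. Packets hold 4 samples here.
tx = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j, 1 + 1j, 1 + 1j]
rx = [0j] * 9 + [0.5j * s for s in tx] + [0j]
rx_times = [1000, 1250, 1500, 1750]  # one made-up URB timestamp per transfer

lag = cross_correlation_lag(tx, rx)
t5 = find_t5(tx, rx, rx_times, samples_per_packet=4, batch_size=1)
```

Taking the magnitude of the complex correlation makes the peak insensitive to the constant phase rotation and gain introduced by the loopback path, which is why the sample values themselves need not match.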

4.3.2.3 Data Correlation

T2, T3, T6 and T7 are measured for all the calls to their corresponding execution statements. As the LimeSDR is continuously streaming data, the timestamps for the relevant execution calls need to be identified. This problem is illustrated for the RX data path in Figure 4.10.

Figure 4.10: TimeStamp Correlation problem

From the "Timing Measurement Program" we derive T5, which is the time of the first relevant RX USB transfer when a message is looped back. We also know T8 from GNU Radio. However, the continuous stream of RX samples from the LimeSDR results in multiple calls to the libusb and driver timestamp measurements, shown in Figure 4.10 as TDx and TUx. To measure the corresponding delays we need to figure out which TDx and TUx should be picked given T5 and T8.

This project avoids doing this check at runtime in each component, as it is difficult to find the relevant data in a sample stream and it adds unnecessary overhead which would impact the measured delays. We use the knowledge that the timestamps must satisfy T8 > T7 > T6 > T5 to find all possible values of T6 and T7 which satisfy this condition. We defined T5 as the time the first RX USB packet is detected, so we take the first values of T6 and T7 which satisfy the condition. This process is described in Algorithm 2. The algorithm assumes that the T5 value is correct; it then searches for the first T6 that is greater than T5 because of the condition defined before. Once T6 has been determined, it looks for the T7 that is just greater than T6. Similarly, the T8 greater than T7 is found. One limitation of this measurement is that we cannot correlate all the timestamps. This process is similar for the TX data path.

Algorithm 2 Data Correlation
    T ← Time Period of Message Source
    l ← min(length of time arrays)
    i ← 0
    j ← 0
    k ← 0
    m ← 0
    flag ← 0
    for i < l do
        while (T5[i] >= T6[j]) do
            j ← j + 1
        end while
        while (T6[j] >= T7[k]) do
            k ← k + 1
        end while
        while (T7[k] >= T8[m]) do
            m ← m + 1
        end while
        ∆GNURadio[i] ← T8[m] − T7[k]
        ∆Driver[i] ← T7[k] − T6[j]
        ∆Kernel[i] ← T6[j] − T5[i]
    end for
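The RX-side correlation of Algorithm 2 can be sketched in Python as shown below. The function name, the returned dictionary keys and the explicit bounds checks are assumptions of this sketch and are not part of the algorithm as listed; the example timestamps are made up.

```python
def correlate_rx_timestamps(t5, t6, t7, t8):
    """Pick, for each T5, the first T6 > T5, then the first T7 > T6 and T8 > T7.

    All inputs are sorted lists of timestamps in the same time unit.
    The bounds checks (stop when any array is exhausted) are an addition
    of this sketch.
    """
    j = k = m = 0
    results = []
    for t5_i in t5:
        while j < len(t6) and t6[j] <= t5_i:
            j += 1
        if j == len(t6):
            break
        while k < len(t7) and t7[k] <= t6[j]:
            k += 1
        if k == len(t7):
            break
        while m < len(t8) and t8[m] <= t7[k]:
            m += 1
        if m == len(t8):
            break
        results.append({
            "kernel": t6[j] - t5_i,      # kernel space to user space delay
            "driver": t7[k] - t6[j],     # LimeSDR driver RX delay
            "gnuradio": t8[m] - t7[k],   # GNU Radio RX processing delay
        })
    return results


# Made-up timestamps in microseconds: T6 and T7 fire for every transfer of the
# continuous stream, T5 and T8 only for the looped-back messages.
t5 = [100, 1000]
t6 = [50, 90, 120, 300, 1010, 1200]
t7 = [60, 95, 150, 320, 1030, 1300]
t8 = [400, 1500]
matched = correlate_rx_timestamps(t5, t6, t7, t8)
```

Since the indices j, k and m only move forward, each array is traversed once, so the offline correlation runs in time linear in the number of recorded timestamps.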

4.4 Experiment 1: Impact of network parameters on latency

The first experiment is designed to evaluate the impact of network parameters such as the data payload size and the sampling rate of the protocol on the latency. The impact of the amount of data sent to the LimeSDR and the impact of a higher sampling rate on the latency need to be evaluated.


• Objective These parameters would help quantitatively analyze if this system design can also be applied for running higher bandwidth protocols like IEEE 802.11ac. Another goal of performing this experiment is to validate the timestamp method for the latency analysis. In the ideal case, the latency should not be dependent on the amount of data or the sampling rate.

• Input Parameters The experiment has two input parameters:

1. The length of the messages generated by the periodic message source block. Since the PHY payload for 802.15.4 should be less than 128 bytes, the message source data length plus the MAC header should be less than that. The maximum message source data length comes to 112 bytes with a MAC header of 15 bytes. We decided to spread the message source length across the entire available range from one to 112. We choose four different message sizes: 1 byte, 37 bytes, 74 bytes, and 112 bytes.

2. The sampling rate defines the interpolation and decimation ratio in the 802.15.4 physical layer. It also configures the FPGA sampling clock. More samples need to be processed for higher sampling rates. The default sampling rate for 802.15.4 is 4 MHz; this project also adds 8 MHz and 16 MHz as sampling rates to get an idea of the impact of a protocol's bandwidth requirements on performance.

• Output Metric The latency which is defined in section 4.3 as T8 − T1.

• Design of the experiment Firstly, the time period of the message source was set to 500 ms to ensure that the individual latency measurements are independent of each other. The LimeSDR FPGA data packet batch size is set to one. A Python script was created to automate the process, which runs the process for 100 seconds. The message source generates 199 packets during this duration, which gives us 199 latency measurements.

4.5 Experiment 2: Analysis of component delays

This experiment aims to find out the delays in different software and hardware components in the system architecture shown in Figure 4.6.

• Objective This experiment will provide a comprehensive understanding of the system and how different input parameters and host computer configurations affect the system performance. Furthermore, this understanding will help future delay mitigation work.


• Input Parameters The experiment has been designed to investigate the impact of message sizes. As the default sampling rate for IEEE 802.15.4 is 4 MHz, it is chosen as the sampling rate for this experiment. We chose three message sizes for this experiment: 1 byte, 56 bytes, and 112 bytes.

• Output Metrics The component delays are the output metrics for this experiment. The component delays and their relation to the timestamps defined previously are summarized below (shown graphically in Figure 4.7).

– GNU Radio TX Processing Delay: T2 − T1
– LimeSDR TX Processing Delay: T3 − T2
– User Space to Kernel Space Delay: T4 − T3
– LimeSDR Loopback Delay: T5 − T4
– Kernel Space to User Space Delay: T6 − T5
– LimeSDR RX Processing Delay: T7 − T6
– GNU Radio RX Processing Delay: T8 − T7

• Design of the Experiment The time period of the message source is again set to 500 ms. The LimeSDR FPGA packet batchsize is set to one as in Experiment 1. As some of the collected time samples cannot be correlated because of the limitation of our data correlation program, the duration of the experiment is increased to 500 seconds, which results in 999 packets. This results in 999 time samples for our analysis.

4.6 Experiment 3: Impact of USB Transfer Size on the component delays

It is necessary to understand how the amount of data in one bus transfer affects the overall latency and the LimeSDR loopback delay, as most of the previous works highlighted that bus transfer delays are quite significant. However, a study of the impact of bus transfer size on the latency is missing in these previous works. An experiment was designed for this purpose by varying the batchsize of the LimeSDR FPGA data packets.

• Objective This experiment is designed to understand the impact of USB transfer sizes on the latency and component delays in the experimental setup shown in Figure 4.6.


• Input Parameters The LimeSDR uses a parameter to set the batchsize of the LimeSDR FPGA data packets depending on the sampling rate, ensuring that data from higher sampling rate configurations does not overflow the buffers. By varying this parameter we select three different batchsizes: one (minimum possible), four (default) and eight (maximum possible).

• Output Metric The latency and component delays are measured as definedin Section 4.3.

• Design of Experiment The experiment collects 999 latency and component delay measurements over 500 seconds. The time period of the message source is set to 500 ms, and the message source generates a 56-byte data payload.


Chapter 5

Mitigation Methods

This chapter introduces our mitigation strategies for lowering the overall latency. We decided to focus on decreasing the LimeSDR loopback delay and the GNU Radio processing delays, as it is evident from our results that they are the major contributors to the overall latency. The chapter first investigates the reasons for the LimeSDR loopback delay, followed by a discussion of our method of decreasing the LimeSDR FPGA packet size. We then discuss our strategy for modifying the GNU Radio scheduler parameter that specifies the maximum number of data elements processed in one execution of each block. The design of the experiment for evaluating our strategies is discussed in the last section.

5.1 LimeSDR loopback delay

We found that the LimeSDR loopback delay is quite significant in our results from Experiment 2. The software processing delays are also significant, but the difference in results across the two host computers gives us enough reason to argue that these delays are caused by limited host computer processing capability and can be mitigated to some extent by upgrading the host computer. The LimeSDR loopback delay is constant across both host computers and thus can be classified as a property of the LimeSDR-USB platform. We decided to focus on the LimeSDR loopback delay as it is a fundamental delay of the platform and will affect the performance of any system designed using the LimeSDR-USB platform.

In order to have a closer understanding, we segment the LimeSDR loopback delay into LimeSDR platform delays and the bus transfer delay. USB 3.0 provides a data bandwidth of 4 Gbps, and if the entire bandwidth is available for the data transfer, it would take (4096 × 8) / 4 Gbps = 8.192 µs to transfer a single LimeSDR FPGA packet of 4096 bytes. This is much smaller than the overall LimeSDR loopback delay and can be safely ignored. Therefore, we can approximate the LimeSDR loopback delay as the LimeSDR platform delay, which can be decomposed into processing delays and buffering delays in the LimeSDR-USB platform. The processing delays can be neglected, as the main processing task of the hardware logic is to convert streams of samples to sample data packets and vice versa. This conversion involves buffering of sample data and is dependent on the implementation of the data paths in the LimeSDR FPGA. It is necessary to have a closer understanding of the FPGA implementation of these data paths.

The RX data path is responsible for converting sample streams to packets. It incurs buffering delays, as it has to wait for the samples, which arrive at the configured sampling rate. On the other hand, the TX data path receives data packets from the host computer using a high-speed parallel interface. So all the data necessary for creating the sample stream is available to the TX data path, and it does not experience buffering delays. With this in mind, we focus more on the RX data path.

5.1.1 Analytical understanding of the LimeSDR FPGA RX data path

In this subsection, the report concentrates on highlighting the implementation details of the RX data path in the LimeSDR FPGA and on deriving a mathematical expression for the RX data path delays, in order to find a suitable choice of parameters for decreasing the RX data path delay.

Figure 5.1: LimeSDR FPGA RX Path


5.1.1.1 RX Path:

Figure 5.1 shows the overview of the RX Path implementation on the LimeSDR-USB FPGA. The hardware logic is described as a number of processing blocks connected via signals to make the design modular and understandable. This paragraph uses quoted text and italics to represent processing blocks and signals respectively.

Sample data from the LMS7002M is captured by the "LMS7002_DDIN", which uses an ALT DDIO IP [10] block to capture in-phase and quadrature samples at double data rate. "LMS7002_DDIN" latches the sample data collected at the positive and negative edges of the clock into two different components: DIQ_H and DIQ_L respectively. It introduces one clock cycle of delay between the LMS7002M sample stream and the DIQ_L, DIQ_H outputs.

The "RXIQ" block is responsible for arranging the individual components into a single data structure, in which the in-phase component of a sample is followed by its quadrature component. It consumes four components coming from the "LMS7002_DDIN" block and arranges them in the sequence DIQ_L, DIQ_H, DIQ_L, DIQ_H. "RXIQ" takes one clock cycle to latch the component data coming from the "LMS7002_DDIN". It takes three clock cycles to produce the structure. Finally, the output is latched after one clock cycle from the internal output register. It writes this structure every two clock cycles to the "FIFO" by enabling the wrreq signal.

The first "SMPL_COUNTER" is a 64-bit counter which increases its count every time "RXIQ" asserts wrreq for writing new data samples to the "FIFO". The count is sent to the TX Path to track the number of produced samples. The second "SMPL_COUNTER" is also a 64-bit counter, which is enabled when the data is read from the "FIFO" by the "DATA2PACKETS" block. It increases its count every clock cycle when the rdreq signal is '1'.

The "FIFO" uses the FPGA RX Phase-Locked Loop (PLL) output as the read and write clocks. The RX PLL frequency is equal to the sampling rate of the data converters. The write enable is controlled by the "RXIQ" using the wrreq signal, so two new samples are written in two clock cycles of the RX PLL. The "DATA2PACKETS" controls the read enable signal (rdreq). The "FIFO" receives the 48-bit data samples, stores them in a FIFO structure and keeps track of how many samples have been loaded into the FIFO. The "FIFO" takes one clock cycle for latching the input. The counter for the number of elements loaded into the FIFO is updated five clock cycles after the data has been latched.

The "DATA2PACKETS" block is responsible for converting data samples into packets. It mainly consists of two Finite State Machines (FSMs). The first FSM controls the read and write signals for the input and output blocks. It monitors the amount of data in the "FIFO", and when it is greater than the amount of data in one packet it asserts the rdreq signal. It takes two clock cycles for the "DATA2PACKETS" to register that the number of data elements in the "FIFO" is sufficient for one FPGA packet. The first FSM takes two more clock cycles to generate the rdreq signal after that.

The second FSM arranges the data in the FPGA data packet structure. It latches the value of the sample count from the second "SMPL_COUNTER" when the first FSM asserts the rdreq signal. It adds the sample count and different flags as metadata to each LimeSDR FPGA packet. Once the first FSM determines that there is enough space to write the data to the FX3 buffer, it enables the wrreq signal and writes 64 bits of data every clock cycle. The second FSM takes 6 clock cycles to output the first data element read from the "FIFO" after rdreq has been generated by the first FSM.

5.1.1.2 RX Path Delays

The delays introduced by the individual blocks have been highlighted in the previous paragraphs and are visualized in Figure 5.2. All these delays have been measured by simulation of the RX data path in Modelsim.

• Delay introduced by the "LMS7002_DDIN" block: 1 clock cycle

• Delay introduced by the "RXIQ" block: 5 clock cycles; 1 clock cycle each for input and output latching, and 3 clock cycles for forming the data structure.

• Delay introduced by the "FIFO" block: 6 clock cycles; 1 clock cycle for input latching and 5 clock cycles for the increment of an internal counter

• Delay introduced by the "DATA2PACKETS" block: 10 clock cycles; 4 clock cycles for generation of the rdreq signal and 6 clock cycles for the output of the first data element read from the "FIFO"

The report will refer to these block delays as constant delays. We define the sample index as the number of samples that have entered the RX data path, with the first sample from the LMS7002M entering the RX data path having sample index zero. The constant delay is 22 clock cycles for even index samples, and 21 clock cycles for odd index samples. The even index samples take one more clock cycle as they have to wait for the odd index samples in the "RXIQ" block. The constant delays can be represented as:

Figure 5.2: LimeSDR RX data path timing diagram

$$\mathrm{sampl}_{\mathrm{rel}} = \mathrm{sampl}_{\mathrm{abs}} \bmod N \tag{5.1a}$$

$$\mathrm{packet\_number} = \left\lfloor \frac{\mathrm{sampl}_{\mathrm{abs}}}{N} \right\rfloor \tag{5.1b}$$

$$\Delta_{\mathrm{constant}} = \frac{22 - \lfloor \mathrm{sampl}_{\mathrm{rel}} \bmod 2 \rfloor}{f_s} \tag{5.1c}$$

In Equation 5.1, N is the number of samples in one packet, fs is the FPGA RX PLL frequency, samplrel is the relative sample index, samplabs is the absolute sample index, mod is the modulus operator and ⌊·⌋ is the floor operator.

For example, with N = 1020 and samplabs = 8000, packet_number would be ⌊8000/1020⌋ = 7, and samplrel would be 8000 mod 1020 = 860.

In addition to the constant delays, there are two more delays that need to be considered. They are the Queuing Delay and the Streaming Delay. The data elements in the "FIFO" have to wait until there is sufficient data for a single packet. This waiting time is referred to as the Queuing Delay. The Queuing Delay is dependent on the relative sample index. If the element is the Nth sample (relative sample index zero), it has to wait for N clock cycles, where N is the number of samples in one packet. If the element is the (N − 1)th sample, it needs to wait only for one more sample, that is, two clock cycles.

$$\Delta_{\mathrm{queuing}} = \frac{N - 2 \times \left\lfloor \frac{\mathrm{sampl}_{\mathrm{rel}}}{2} \right\rfloor}{f_s} \tag{5.2}$$

The "DATA2PACKETS" block outputs two samples in one clock cycle, streaming out the stored FIFO data sequentially. The time a sample has to wait to be streamed is referred to as the streaming delay. The streaming delay is also dependent on the arrival time of the sample, which is equivalent to the absolute sample index. If the sample has sample index N (relative sample index zero), it is the first data element of a packet and does not need to wait for any sample to be streamed ahead of it, so it has zero streaming delay. On the other hand, the sample with relative sample index N − 1 needs to wait for the samples up to index N − 3 to be streamed first, before it is output together with the (N − 2)th sample.

$$\Delta_{\mathrm{streaming}} = \frac{\left\lfloor \frac{\mathrm{sampl}_{\mathrm{rel}}}{2} \right\rfloor}{f_s} \tag{5.3}$$

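The total delay can be calculated as:

$$\Delta_{\mathrm{total}} = \Delta_{\mathrm{constant}} + \Delta_{\mathrm{queuing}} + \Delta_{\mathrm{streaming}} \tag{5.4a}$$

$$\Delta_{\mathrm{total}} = \frac{22 + N - \left\lfloor \frac{\mathrm{sampl}_{\mathrm{rel}}}{2} \right\rfloor - \lfloor \mathrm{sampl}_{\mathrm{rel}} \bmod 2 \rfloor}{f_s} \tag{5.4b}$$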
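The step from Equation 5.4a to Equation 5.4b can be written out by substituting Equations 5.1c, 5.2 and 5.3. The short derivation below is added for readability; the two numerical checks assume the default N = 1020 and fs = 4 MHz.

```latex
\begin{align*}
\Delta_{\mathrm{total}}
  &= \frac{22 - \lfloor \mathrm{sampl}_{\mathrm{rel}} \bmod 2 \rfloor}{f_s}
   + \frac{N - 2\left\lfloor \mathrm{sampl}_{\mathrm{rel}}/2 \right\rfloor}{f_s}
   + \frac{\left\lfloor \mathrm{sampl}_{\mathrm{rel}}/2 \right\rfloor}{f_s} \\
  &= \frac{22 + N - \left\lfloor \mathrm{sampl}_{\mathrm{rel}}/2 \right\rfloor
   - \lfloor \mathrm{sampl}_{\mathrm{rel}} \bmod 2 \rfloor}{f_s} \\[6pt]
\mathrm{sampl}_{\mathrm{rel}} = 0:\quad
  \Delta_{\mathrm{total}} &= \frac{22 + 1020}{4\,\mathrm{MHz}} = 260.5\,\mu\mathrm{s} \\
\mathrm{sampl}_{\mathrm{rel}} = 1019:\quad
  \Delta_{\mathrm{total}} &= \frac{22 + 1020 - 509 - 1}{4\,\mathrm{MHz}} = 133\,\mu\mathrm{s}
\end{align*}
```

The queuing term falls by two clock cycles for each sample pair while the streaming term rises by only one, so the total delay decreases as the relative sample index grows.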

5.1.2 Decreasing ∆total

In order to decrease ∆total as described by Equation 5.4, we can either decrease N or increase fs, as samplrel is dependent on the arrival time of the sample, which we cannot control. Since the primary focus of the project was to evaluate the performance of IEEE 802.15.4 networks, we keep fs at 4 MHz. Another workaround might be increasing fs and using software decimation, but this leads to higher software processing delays, as the software then needs to process a higher number of data samples. ∆total can also be decreased by decreasing N, which will increase the amount of data transferred through the USB interface, as every LimeSDR FPGA packet carries 16 bytes of packet header. This increase in the size of the USB transfer data can easily be tolerated by the USB interface, as in the default case we are using roughly 4% of the total bandwidth provided by the USB interface. Because of this, we concentrate on decreasing N.

We do a min-max analysis of ∆total using Equation 5.4 to observe the impact of N on the overall latency and jitter. Depending on its sample index, an incoming data sample can experience any delay between the minimum and maximum values of Equation 5.4 with equal probability. Because of this, the maximum amount of jitter contributed by the FPGA RX path can be estimated by finding the difference between the minimum and maximum values of Equation 5.4. Table 5.1 shows the approximate results using Equation 5.4 for four different values of N: 1020, 764, 508 and 252. These values of N were chosen as they correspond to the samples needed for 4 kB, 3 kB, 2 kB and 1 kB of USB transfer size respectively, with 16 bytes of LimeSDR FPGA packet header.

N       Min (µs)    Max (µs)    Max − Min (µs)
1020    133         260.5       127.5
764     101         196.5       95.5
508     69          132.5       63.5
252     37          68.5        31.5

Table 5.1: Min-Max analysis for different values of N
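The values in Table 5.1 can be reproduced by evaluating Equation 5.4b over all relative sample indices. The sketch below is illustrative; the function names are assumptions, and fs = 4 MHz is taken from the sampling rate used in this project.

```python
def total_delay_cycles(sampl_rel, n):
    """Numerator of Equation 5.4b: RX path delay in clock cycles."""
    return 22 + n - sampl_rel // 2 - sampl_rel % 2


def min_max_delay_us(n, fs_hz=4e6):
    """Minimum and maximum RX path delay in microseconds for packet size n."""
    cycles = [total_delay_cycles(s, n) for s in range(n)]
    scale = 1e6 / fs_hz  # microseconds per clock cycle
    return min(cycles) * scale, max(cycles) * scale


for n in (1020, 764, 508, 252):
    lo, hi = min_max_delay_us(n)
    print(f"N={n}: min={lo} us, max={hi} us, max-min={hi - lo} us")
```

The maximum occurs at relative sample index zero and the minimum at index N − 1, so the jitter bound is N/2 − 1 clock cycles and scales almost linearly with N.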

In order to experimentally validate the results from Table 5.1, we configure the TX and RX data paths of the LimeSDR FPGA to use 1 kB, 2 kB, 3 kB and 4 kB as the LimeSDR FPGA packet size. The Cypress EZ-USB FX3 initializes the DMA channel between the GPIF II socket and the USB socket with a DMA buffer size of 4 kB. The USB transfer of smaller LimeSDR FPGA packets has to wait until one DMA buffer is filled, as the USB socket is notified only when a DMA buffer is full. Because of this, we would in effect still be using the default LimeSDR FPGA packet size of 4 kB. We therefore modified the DMA buffer descriptor on the EZ-FX3 microprocessor to make one DMA buffer the same size as the LimeSDR FPGA packet. As the minimum configurable DMA buffer size of the Cypress EZ-FX3 is 1 kB, we set the LimeSDR FPGA packet sizes to multiples of 1 kB. Finally, the LimeSDR driver on the host computer is modified to handle the reduced size of the LimeSDR FPGA packet structure.
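The relation between the packet sizes above and the number of samples N per packet can be checked with a few lines. This sketch assumes 4 bytes per sample, which follows from the default packet carrying 4080 data bytes for 1020 samples; the function names are assumptions made for illustration.

```python
HEADER_BYTES = 16      # LimeSDR FPGA packet header size (from the text)
BYTES_PER_SAMPLE = 4   # 4080 data bytes / 1020 samples in the default packet


def samples_per_packet(packet_bytes):
    """Number of samples N carried by one FPGA packet of the given size."""
    payload = packet_bytes - HEADER_BYTES
    if payload <= 0 or payload % BYTES_PER_SAMPLE:
        raise ValueError("packet size does not hold a whole number of samples")
    return payload // BYTES_PER_SAMPLE


def header_overhead(packet_bytes):
    """Fraction of each packet that is occupied by the header."""
    return HEADER_BYTES / packet_bytes


def packets_per_second(packet_bytes, fs_hz=4_000_000):
    """Packets produced per second on the RX path at sampling rate fs_hz."""
    return fs_hz / samples_per_packet(packet_bytes)


for size in (4096, 3072, 2048, 1024):
    print(f"{size} B: N={samples_per_packet(size)}, "
          f"header={header_overhead(size):.2%}, "
          f"{packets_per_second(size):.0f} packets/s")
```

Smaller packets raise both the header share and the number of packets per second, which is the overhead side of the tradeoff discussed in this section.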


5.2 Software Chain Delays

Although the software chain delays are heavily dependent on the host computer processing resources, the way the scheduler controls the execution of GNU Radio blocks also has an impact on the software chain delays. In a software implementation of signal processing algorithms, the number of data elements to be processed has a direct correlation with the time taken by the algorithm, with a higher number of data elements resulting in a higher processing time. In GNU Radio, as most of the blocks are signal processing blocks, the number of data elements processed in one execution of these blocks impacts the latency through the flow graph. If one of the GNU Radio blocks in a flow graph takes a significantly longer processing time, then the buffers of the downstream blocks run dry. In this case, the scheduler fails to pipeline the block executions properly, thereby increasing the latency. Decreasing the processing time of the GNU Radio block might lead to better pipelining, lowering the latency of the flow graph. Each execution of a GNU Radio block is associated with a number of control signals, as described in Section 2.1.3. Increasing the number of executions of a GNU Radio block will lead to an increase in the number of control signals, which will increase the processing overhead. We need to investigate the impact of the number of elements processed by the flow graph to find the tradeoff between lower latency and increased processing overhead.

The GNU Radio scheduler allows us to control the maximum number of data elements processed in one execution of a GNU Radio processing block by using the max_noutput_items method. This method caps the maximum number of data elements, but the scheduler can still decide on the actual number of elements processed in one execution. We decided to set max_noutput_items for the entire TX and RX flow graphs, instead of tuning this parameter for each block. This will help the GNU Radio scheduler to optimize the flow graphs for latency in case any processing block has longer processing delays.
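The tradeoff described in this section can be illustrated with a deliberately simple cost model. This is a toy model under stated assumptions, not a description of how the GNU Radio scheduler works: it assumes a fixed per-execution overhead and a fixed per-item cost, and all numbers are hypothetical.

```python
import math


def chunking_model(total_items, max_noutput_items,
                   per_call_overhead_ns, per_item_cost_ns):
    """Toy model of one block processing total_items in capped chunks.

    Returns (number of executions, time until the first chunk is available
    downstream, total processing time), with times in nanoseconds.
    """
    calls = math.ceil(total_items / max_noutput_items)
    first_chunk = min(total_items, max_noutput_items)
    first_output_ns = per_call_overhead_ns + first_chunk * per_item_cost_ns
    total_ns = calls * per_call_overhead_ns + total_items * per_item_cost_ns
    return calls, first_output_ns, total_ns


# Hypothetical numbers: 8000 items, 20 us overhead per execution, 50 ns per item.
for cap in (1000, 2000, 4000):
    calls, first_ns, total_ns = chunking_model(8000, cap, 20_000, 50)
    print(f"cap={cap}: {calls} executions, first output after {first_ns} ns, "
          f"total {total_ns} ns")
```

In this model a smaller cap lets downstream blocks start earlier, at the price of more executions and therefore more total overhead, which is the balance the experiments are meant to quantify on real hardware.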

5.3 Experiment 4: Evaluation of Mitigation Methods

This experiment is designed to evaluate the impact of a smaller LimeSDR FPGA packet size and of the maximum number of elements processed by a GNU Radio block on the latency and component delays.

• Objective This experiment is designed to evaluate our mitigation methods described previously using the experimental setup described in Figure 4.6. This experiment will help in analyzing which of the measured configurations of USB transfer size and GNU Radio execution size provides the best results. Since this experiment is done across both host computers, the impact of processing resources will also be studied with respect to our mitigation strategies.

• Input Parameters The experiment uses two input parameters:

– LimeSDR FPGA Packet Size Following the discussion in 5.1.2, we setthe LimeSDR packet size to 1 kB, 2 kB, 3 kB and 4 kB (default case).

– GNU Radio n_output_items We set the GNU Radio flow graph’sn_output_items to 1000, 2000 and 4000 data elements.

• Output Metric Latency and Component Delays as described in Experiment2. As the best configuration of the input parameters for the host computerswill be dependent on its processing resources, the percentage of CPU andmemory usage will also be measured using pidstat.

• Design of Experiment The experiment collects 999 latency and component delay measurements over 500 seconds. The time period of the message source is set to 500 ms, and the message source generates a 10-byte data payload. Pidstat collects the percentage of CPU and memory usage every second (the smallest possible interval). The sampling rate is set to 4 MHz as we are evaluating the latency with respect to IEEE 802.15.4 networks.

5.4 Experiment 5: Throughput Analysis

Both of our mitigation strategies revolve around decreasing the transfer time or the processing time. This decrease comes at the cost of an increase in the number of times data has to be transferred and the number of times a block needs to be executed, which results in increased processing overhead. Since we are dealing with multi-core processors, we assume that the increase in processing overhead can be compensated by better pipelined processing, giving us better latency. In this case, although we might achieve lower latency, the throughput of the entire system can be affected because of the increased processing overhead for handling the same amount of data. We therefore need to evaluate how our mitigation strategies affect the throughput of the system.

We send a controlled data payload size from the periodic message source using the experimental setup shown in Figure 4.2. The time when the SFD is detected is noted as $T_{SFD}$; when the complete packet is decoded, we note the timestamp as $T_{Complete}$. As we control the data payload size, we know the size of the data decoded between the two timestamps. We can then define throughput as:

$$\text{Throughput} = \frac{\text{data\_payload\_size}}{T_{Complete} - T_{SFD}} \tag{5.5}$$

This throughput definition provides the MAC layer throughput, as we calculate it using the MAC data payload size.
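The definition in Equation 5.5 can be sketched in a few lines of Python. The function name mac_throughput_kbps and the example timestamps are hypothetical. The sketch assumes timestamps in microseconds and a result in kbit/s, since the thesis reports throughput in kbps; the exact unit conventions used by the author are not stated here.

```python
def mac_throughput_kbps(data_payload_size_bytes, t_sfd_us, t_complete_us):
    """MAC payload bits divided by the decoding interval T_Complete - T_SFD.

    Assumptions: payload size in bytes, timestamps in microseconds,
    result in kbit/s.
    """
    duration_us = t_complete_us - t_sfd_us
    if duration_us <= 0:
        raise ValueError("T_Complete must be later than T_SFD")
    bits = data_payload_size_bytes * 8
    # bits per microsecond equals Mbit/s; multiply by 1000 for kbit/s
    return bits / duration_us * 1000.0


# Hypothetical timestamps: a 112-byte payload decoded 4000 us after the SFD.
print(mac_throughput_kbps(112, t_sfd_us=1000.0, t_complete_us=5000.0))
```

With these illustrative numbers, 896 bits over 4000 µs corresponds to 224 kbit/s.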

The best combination of input parameters for the desktop computer, as determined from the results of Experiment 4, is taken as the input for this measurement. We compare the throughput for this configuration to that of the default configuration of the LimeSDR platform. We set the data payload size to 112 bytes and the time period of the message source to 500 ms. The experiment was run for 500 seconds, resulting in 999 throughput measurements.


Chapter 6

Results and Analysis

This chapter presents the results of the timing characterization experiments described in Chapter 4. The analysis of these experiments is presented together with the results. We then present the evaluation of the mitigation methods presented in Chapter 5.

6.1 Impact of data payload size and sampling rate on the overall latency

This section introduces the results of Experiment 1, described in Section 4.4. We use this experiment to analyze the impact of network parameters such as sampling rate and data payload size on the latency. Since we run this experiment on both host computers, the results also give an indication of the impact of processing resources.

Figure 6.1 and Figure 6.2 show the results of Experiment 1 for the laptop and desktop computer respectively. We chose bar graphs to visualize the results as they clearly depict the impact of both the data payload size and the sampling rate on the overall mean and standard deviation of latency. The height of each bar represents the mean latency, with the standard deviation shown as error bars on top of each bar. The latency variation with respect to the data payload size is shown as adjacent bars of different colors. In these figures, the x-axis shows the sampling rate and the y-axis shows time in µs.

The laptop, with its lower processing resources, shows latency increasing with both sampling rate and data payload size; a message size of one byte at a 4 MHz sampling rate provides the lowest mean and standard deviation of latency. The laptop struggles to process the continuous data stream, leading to significantly higher mean and standard deviation, as shown by the increase in the height of the bars and the length of the error bars respectively. Figure 6.1 also highlights the broader trend of higher sampling rates and larger message sizes having higher standard deviation in the latency measurements, with all the measurements for 16 MHz having a standard deviation over 1 ms. This indicates that we need higher processing resources in order to handle protocols with higher bandwidth requirements.

Figure 6.1: Experiment 1: Latency vs Sampling Rates, Message Sizes for laptop (lower processing resources)

Figure 6.2: Experiment 1: Latency vs Sampling Rates, Message Sizes for desktop (higher processing resources)

The desktop computer, with higher processing resources, shows more consistent results for the 4 MHz sampling rate across all message sizes. This is mainly because our definition of latency is asymmetric, making it independent of the data payload size. For smaller message sizes, a sampling rate of 8 MHz shows the best results, while the results diverge with increased data payload size. The standard deviation of latency for the 8 MHz and 16 MHz sampling rates increases with the data payload size, indicating that we might need better processing resources for handling higher bandwidth protocols. Comparison of Figure 6.1 and Figure 6.2 clearly highlights the impact of processing resources on the latency, with the desktop computer having lower latency for all combinations of the input parameters.

6.2 Component Delay Analysis

In this section, we present the results of Experiment 2, described in Section 4.5. We analyze the results of this experiment to get a deeper understanding of the system and to investigate the reasons for the results shown in the previous section. These results and the subsequent analysis also help us in the selection of our mitigation methods.

Figure 6.3 and Figure 6.4 show the results of the component delays for the laptop and desktop respectively. The x-axis shows the data payload size and the y-axis shows the component delays in µs. The results are shown as bubble plots, which convey both the median of the data samples and the variation of the measurements. The height of each bubble shows the median delay, whereas the area of the bubble shows the difference between the 95th percentile and the 5th percentile values. This helps us visualize the variation of component delays across measurement samples. The larger the area of a bubble, the higher the amount of jitter contributed by that component.
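The two quantities encoded in each bubble can be computed as in the sketch below. The helper names and the sample values are hypothetical, and the percentile uses linear interpolation between ranks, which is an assumption: the thesis does not state which percentile method was used.

```python
import statistics


def percentile(sorted_values, p):
    """Percentile (p in 0..100) of pre-sorted data, linear interpolation between ranks."""
    if not sorted_values:
        raise ValueError("no data")
    pos = (len(sorted_values) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


def bubble_stats(delays_us):
    """Return (median, 95th minus 5th percentile) for one component delay."""
    data = sorted(delays_us)
    spread = percentile(data, 95) - percentile(data, 5)
    return statistics.median(data), spread


# Hypothetical delay samples in microseconds, for illustration only.
median_us, spread_us = bubble_stats([410, 395, 402, 980, 399, 405, 401, 397])
print(median_us, spread_us)
```

The median sets the height of the bubble and the percentile spread sets its area, so a single outlier such as the 980 µs sample widens the bubble without moving it much.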

Both results show an increase in TX software delay ($T_4 - T_1$) with message size. This is understandable, as the TX software chain needs to process more data for a larger data payload. The rate of increase differs between the two computers, however, with the laptop measurements showing a faster rate of increase than the desktop computer. The GNU Radio TX flow graph is processing intensive, and the higher CPU clock speed and higher number of CPU cores help the desktop computer process faster. The RX software delay ($T_8 - T_5$) is more or less constant across the message sizes. This is primarily because of our definition of $T_8$: the RX software chain needs to process the same amount of data regardless of the data payload size. The desktop computer performs slightly better than the laptop, taking approximately (insert value) less time.

Figure 6.3: Experiment 2: Component Delays vs Message Sizes for laptop (lower processing resources)

Figure 6.4: Experiment 2: Component Delays vs Message Sizes for desktop (higher processing resources)

The LimeSDR loopback delay ($T_5 - T_4$) decreases with message size. We hypothesize that this pattern is caused by the buffering of data between the LimeSDR FPGA and the Cypress EZ-USB FX3 on the TX path of the LimeSDR-USB. So a smaller amount of data needs to shift across these buffers before finally reaching the FPGA TX path for transmission. The shift clock for the TX FIFOs depends on the sampling rate, which is 4 MHz, while the buffer write clock is 100 MHz. For larger message sizes the buffer fills up faster, as more data is available, and hence the data needs to shift for a significantly shorter time.

The measurements on both computers show similar LimeSDR loopback delays, indicating that the placement of our time probes is correct and that the measurements are independent of the host computer's processing resources. The uniform results across message sizes at the 4 MHz sampling rate for the desktop computer (shown in Figure 6.2) can be attributed to the increase in TX processing delay being compensated by the decrease in LimeSDR loopback delay. In the case of the laptop, although the LimeSDR loopback delay decreases, the TX processing time increases with message size. Hence the total latency increases with data payload size, as the RX software chain delay is more or less constant.
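The decomposition used in this analysis follows directly from the four timestamps. The sketch below assumes the timestamps are available as plain numbers in microseconds; the Timestamps type, the function name and the example values are hypothetical.

```python
from typing import NamedTuple


class Timestamps(NamedTuple):
    """The four probe times used in this section, in microseconds (hypothetical container)."""
    t1: float
    t4: float
    t5: float
    t8: float


def component_delays(ts):
    """Split the overall latency T8 - T1 into the three component delays."""
    return {
        "tx_software": ts.t4 - ts.t1,       # TX software chain delay
        "limesdr_loopback": ts.t5 - ts.t4,  # LimeSDR loopback delay
        "rx_software": ts.t8 - ts.t5,       # RX software chain delay
        "latency": ts.t8 - ts.t1,           # overall latency
    }


# Illustrative values only; they are not measurements from the thesis.
d = component_delays(Timestamps(t1=0, t4=300, t5=800, t8=1100))
print(d)
```

Because the three components sum to the overall latency, an increase in one can be offset by a decrease in another, which is the compensation effect described for the desktop computer.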

An additional experiment was performed on the desktop computer to measure the impact of sampling rate on component delays. The LimeSDR loopback delay decreases for the 8 MHz sampling rate; the measurement for 16 MHz increases because we had to increase the USB transfer size to obtain a reasonable number of measurement samples. The decrease of LimeSDR loopback delay with sampling rate explains the decrease of overall latency for the sampling rate of 8 MHz shown in Figure 6.2. The TX software chain delay increases with sampling rate as it has to process more data samples.

6.3 Impact of USB Transfer size

This section presents the results of Experiment 3, which was described in Section 4.6. The results help us analyze the impact of the batchsize of LimeSDR FPGA packets on the overall latency and how component delays are affected by an increase in the USB transfer size. These results further provide the motivation for one of our mitigation strategies.

Figure 6.5: Experiment 2: Component Delays vs Sampling Rates for the desktop computer (higher processing resources)

Figure 6.6: Experiment 3: Mean Latency vs Batchsize of LimeSDR FPGA packets (1 FPGA packet = 4096 bytes)

Figure 6.7: Experiment 3: Component Delays vs Batchsize of LimeSDR FPGA packets

Figure 6.6 shows the box plots of the latency measurements for three different batchsizes: 1, 4 and 8. The box plots show that the latency and jitter increase as the batchsize increases. These results can be analyzed with the help of Figure 6.7, which shows the component delays for these latency measurements as bubble plots. The increase of LimeSDR loopback delay contributes significantly to the increase of overall latency and jitter, as is evident from the increase in height and area of the blue bubbles representing the LimeSDR loopback delay in Figure 6.7. This result is consistent with the analysis of Schmid et al. [23] that the bus transfer time depends on the sampling rate. It takes 255 µs to generate the samples for one FPGA packet at a 4 MHz sampling rate, while it takes 1020 µs and 2040 µs to generate the samples for four and eight LimeSDR FPGA packets respectively. For this reason, the LimeSDR loopback delay increases with higher batchsizes. At an 8 MHz sampling rate it takes 127.5 µs to generate the samples needed for one FPGA packet, which explains the decrease of LimeSDR loopback delay in Figure 6.5 for the 8 MHz sampling rate measurements.
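The sample generation times quoted above can be reproduced with a short calculation. The figure of 1020 samples per FPGA packet is not stated in this section; it is inferred here from the quoted 255 µs at 4 MHz and should be read as an assumption, as should the function name.

```python
# Assumption: 1020 samples per FPGA packet, inferred from 255 us * 4 MHz.
SAMPLES_PER_FPGA_PACKET = 1020


def sample_generation_time_us(batchsize, sampling_rate_hz,
                              samples_per_packet=SAMPLES_PER_FPGA_PACKET):
    """Time in microseconds needed to produce the samples for `batchsize` FPGA packets."""
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return batchsize * samples_per_packet * 1e6 / sampling_rate_hz


for batchsize in (1, 4, 8):
    print(batchsize, sample_generation_time_us(batchsize, 4e6))
print(1, sample_generation_time_us(1, 8e6))
```

Under this assumption the calculation gives 255 µs, 1020 µs and 2040 µs for batchsizes 1, 4 and 8 at 4 MHz, and 127.5 µs for one packet at 8 MHz, matching the values in the text.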

Figure 6.7 also shows that the LimeSDR loopback delay for a batchsize of one is greater than the time needed to generate the samples, while it is lower than the sample generation time for batchsizes of four and eight. This leads us to believe that the TX path delays are significant with respect to the sample generation time for one FPGA packet. The RX software chain delays increase slightly with increasing batchsize, as the GNU Radio blocks now process more data samples in one execution. The TX software chain delays are constant across all three batchsizes.

6.4 Evaluation of mitigation strategies

This section introduces the results of Experiment 4 (Section 5.3). The results focus on the impact of LimeSDR packet size and GNU Radio max_noutput_items on the overall latency and component delays. The percentage of CPU usage collected using pidstat is used for analyzing the obtained results with respect to the host computer processing resources.

Figure 6.8: Experiment 4: Latency vs LimeSDR FPGA packet size for the laptop (lower processing resources)

6.4.1 Impact on latency

Figure 6.8 and Figure 6.9 show the results for the laptop and desktop computer respectively. We use bar graphs to visualize the results as they make it easy to represent both the mean and the standard deviation of our measurements with respect to the input parameters. The LimeSDR FPGA packet size is represented in bytes along the x-axis, with the y-axis representing the latency ($T_8 - T_1$) in µs. The height of each bar represents the mean latency, and the standard deviation of the latency is shown as error bars. The variation of overall latency with GNU Radio max_noutput_items is shown for each LimeSDR FPGA packet size as adjacent bars in different shades of blue.

For the laptop, we get the lowest mean and standard deviation of latency for a LimeSDR FPGA packet size of 3072 bytes with max_noutput_items set to 2000. This configuration decreases the mean and standard deviation of latency to 1165 µs and 478 µs, from 1329 µs and 599 µs for the default configuration of the LimeSDR-USB platform. This is a decrease in mean latency of 12%, with the jitter decreasing by 20%. Latency for LimeSDR FPGA packet sizes of 1024 bytes and 2048 bytes shows high mean and standard deviation, which contrasts with our understanding that a lower LimeSDR FPGA packet size leads to lower latency because of the lower LimeSDR loopback delay. This will be analyzed later with respect to the processing resources.

Figure 6.9: Experiment 4: Latency vs LimeSDR FPGA packet size for the desktop computer (higher processing resources)

The results for the desktop computer show that lower LimeSDR FPGA packet sizes lead to lower latency with less standard deviation. We get the lowest latency and jitter for a LimeSDR FPGA packet size of 1024 bytes and max_noutput_items set to 2000 data elements. This combination of input parameters results in the mean and standard deviation of latency decreasing to 706 µs and 248 µs respectively, from 1135 µs and 414 µs for the default configuration. This is a decrease in mean latency of 37%, with the jitter decreasing by 40%. These results from the desktop computer indicate that the results in Figure 6.8 are definitely impacted by the processing resources of the host computer, as the lower LimeSDR FPGA packet sizes result in high mean and standard deviation of latency on the laptop.
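The percentage reductions quoted for the two host computers can be checked from the reported means and standard deviations. The sketch below is only a worked arithmetic check; the function name and the dictionary layout are hypothetical, and the input numbers are the ones given in the text.

```python
def percent_reduction(baseline, improved):
    """Relative decrease from baseline to improved, in percent."""
    return (baseline - improved) / baseline * 100.0


# (default configuration, best configuration) in microseconds, as reported above.
results = {
    "laptop": {"mean": (1329, 1165), "std": (599, 478)},
    "desktop": {"mean": (1135, 706), "std": (414, 248)},
}

for host, metrics in results.items():
    for name, (baseline, improved) in metrics.items():
        print(f"{host} {name}: {percent_reduction(baseline, improved):.1f}% lower")
```

The unrounded values are roughly 12.3% and 20.2% for the laptop and 37.8% and 40.1% for the desktop computer, in line with the percentages quoted above.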

On both host computers, the impact of GNU Radio max_noutput_items shows no consistent pattern across combinations of the input parameters. Since we did not pass a strict condition to the GNU Radio scheduler, and the operation of the scheduler is a black box to us, we cannot determine the exact causal effect of max_noutput_items on latency.

6.4.2 Impact on the component delays

Figure 6.10: Experiment 4: Component Delays vs LimeSDR FPGA packet size for the desktop computer (higher processing resources)

The component delays for different LimeSDR FPGA packet sizes with max_noutput_items set to 2000 on the desktop computer are shown in Figure 6.10. The RX software delays and the TX software delays are further segmented into their individual component delays, as shown in Figure 6.11 and Figure 6.12 respectively. The results are shown as bubble plots, with the height of each marker representing the mean value and the area of the bubble representing the difference between the 95th percentile and 5th percentile values of the relevant component delays. The LimeSDR FPGA packet size is represented in bytes along the x-axis, with the y-axis representing time in µs.

The LimeSDR loopback delay increases as we increase the LimeSDR FPGA packet size, as is evident from the height of the markers representing the LimeSDR loopback delay. Increasing the LimeSDR FPGA packet size also increases the area of the markers, indicating a higher variance of the LimeSDR loopback delay. These results follow the trend we estimated using Equation 5.4, although the measured values are higher than those shown in Table 5.1, indicating that there are additional delays on the TX path and in the transfer. These results are for a data payload size of 10 bytes, and the LimeSDR loopback delay for the default configuration is similar to that shown in Figure 6.4 for the one-byte data payload case. We hypothesized that the LimeSDR loopback delay may be affected by the buffering between the EZ-USB FX3 and the FPGA, as the data samples have to shift through the buffers; this hypothesis also holds for the measurements shown here, as we are measuring with a small data payload size.

Figure 6.11: Experiment 4: RX Software Chain Component Delays vs LimeSDR FPGA packet size for the desktop computer (higher processing resources)

The RX software chain delay increases slightly with the increase of LimeSDR FPGA packet size, as shown in Figure 6.10. As the LimeSDR FPGA packet size increases, the GNU Radio scheduler lets the blocks process a higher number of data elements in one execution, which results in slightly increased GNU Radio processing time, as shown by the trend in Figure 6.11. We see that the kernel to userspace delay decreases slightly as we increase the LimeSDR FPGA packet size. The slightly higher delay for lower LimeSDR FPGA packet sizes can be attributed to the increase in USB driver system calls.

We observed that the max_noutput_items parameter affects neither the RX software delay nor the LimeSDR loopback delay. The LimeSDR loopback delay is independent of the GNU Radio runtime configuration, hence it is not affected by the max_noutput_items parameter. For the RX software chain, the results can be interpreted in two ways: either there is no GNU Radio block with processing delays long enough to be affected by the max_noutput_items parameter, or the decrease in processing delay is compensated by the increased signaling.

Figure 6.12: Experiment 4: TX Software Chain Component Delays vs LimeSDR FPGA packet size for the desktop computer (higher processing resources)

The TX software delay decreases slightly with the increase of LimeSDR FPGA packet size. Put differently, it is higher for lower packet sizes, primarily because of the increase in the GNU Radio TX processing time, as shown in Figure 6.12. The GNU Radio scheduler needs to schedule the execution of the blocks in the RX flow graph more frequently to handle the lower LimeSDR FPGA packet size. This causes the scheduler to interleave the processing of the blocks in the TX flow graph with those in the RX flow graph, increasing the buffering time for the data in the TX data path. We think this is the reason for the increase in the GNU Radio TX processing delay.


6.4.3 Influence of processing resources

Figure 6.13: Experiment 4: Comparison of the percentage of CPU usage across different LimeSDR FPGA packet sizes for both the host computers

Figure 6.13 shows the box plots of the normalized percentage of CPU usage for the two host computers. The x-axis of the two graphs shows the LimeSDR FPGA packet size in bytes, with the y-axis representing the total percentage of CPU usage. We see that the percentage of CPU usage on the laptop is higher than on the desktop computer for all the LimeSDR FPGA packet sizes. This is mainly because the desktop CPU has a higher processor clock speed and two more cores than the laptop. In general, the percentage of CPU usage decreases with increasing LimeSDR FPGA packet size, indicating that lower packet sizes have significant processing overheads.

The laptop has very high CPU usage for LimeSDR FPGA packet sizes of 1024 and 2048 bytes. The processing overhead for lower LimeSDR FPGA packet sizes causes the laptop's processing resources to throttle, as it is already processing at near maximum capacity (95%). This lack of further processing resources results in increased buffering and unpredictable processing of the flow graph, which explains the higher mean and standard deviation of the latency for the LimeSDR packet sizes of 1024 bytes and 2048 bytes in Figure 6.8. The desktop computer operates at close to 87% for a LimeSDR packet size of 1024 bytes, so it still has processing resources available. Hence it can continue processing the data packets at the incoming data rate, resulting in lower mean and standard deviation of the latency, as shown in Figure 6.9.

6.5 Effect on throughput

Figure 6.14: Experiment 5: Comparison of throughput for the best latency case and the default configuration using the desktop computer.

The results for the impact of our mitigation strategies on the throughput are presented in Figure 6.14. They are presented as box plots of the measured throughput to show its distribution over the duration of the measurement. The x-axis shows the LimeSDR FPGA packet size, with the y-axis representing the throughput measured using Equation 5.5 in kbps. The LimeSDR packet size of 1024 bytes represents the results for the best latency case, while the LimeSDR packet size of 4096 bytes shows the results for the default case. The results highlight that although the variation in throughput is much larger for the best latency case, all the measured throughput values are higher than the one for the default configuration. The median shows a slight increase of approximately 5 kbps from the default case to the best latency case.

As the GNU Radio blocks process a smaller number of data elements for a lower LimeSDR FPGA packet size, multiple execution calls are required, which makes the data decoding process dependent on the scheduling mechanism of the GNU Radio scheduler. So after one execution of the RX flow graph, if the GNU Radio scheduler schedules the processing of the blocks in the RX flow graph with higher probability, the data rate will increase. This is because the LimeSDR provides data at the described sampling rate of 4 MHz, hence the data required for decoding is available. If the data is consumed faster by repeated calls to the RX flow graph, the data rate will increase. Otherwise, if the scheduler schedules the processing of some blocks in the TX chain or the IEEE 802.15.4 MAC block, the data has to be buffered longer, leading to a lower data rate as it increases the time required ($T_{Complete} - T_{SFD}$) for processing the same amount of data. This unpredictability in the scheduling process might explain the high variation of throughput for the best latency case. On the other hand, with a larger USB transfer size the number of execution calls decreases, making the packet decoder less dependent on the scheduling mechanism and hence giving less variance in the throughput for the default case. The increase in throughput can be attributed to better pipelined processing, with the desktop computer having enough processing resources to handle the overhead of multiple execution calls, as explained in the previous subsection.


Chapter 7

Conclusion

The main goal of this Master's thesis was to gain a deeper understanding of the characteristics of the time delays introduced by different components in a LimeSDR based 802.15.4 network. To achieve this goal, we first implemented a GNU Radio based 802.15.4 MAC and PHY stack for the LimeSDR platform. We used timestamps as our method for studying the delays contributed by the various software components. Reduction of the LimeSDR FPGA packet size was chosen as a suitable mitigation technique following the delay analysis.

For the delay analysis, we chose to study the latency and component delays with respect to the MAC payload size and the sampling rate of the LimeSDR platform. The conducted experiments allow us to draw the general conclusion that increasing the amount of data increases the latency due to the larger software processing delays on the host computer. Our experiments show that neither of our host computers is suitable for running higher bandwidth protocols. We observed that a higher USB transfer size negatively affects the overall latency because of longer buffering delays on the LimeSDR.

We chose to focus on the LimeSDR loopback delay, as our delay analysis indicated that this is a fundamental delay of the LimeSDR platform and will be present in all implementations using the LimeSDR. The GNU Radio scheduler was tweaked to find a system configuration which results in the lowest latency. In order to evaluate the impact of processing resources, we ran our experiments on two host computers, explicitly measuring the percentage of CPU usage in our final experiment. The best system configuration for the two host computers, laptop and desktop, decreased the mean latency by 12% and 37% respectively compared to the default configuration of the LimeSDR platform. This helps us lower the IFS and increase the spectral efficiency. The jitter in latency was also decreased by 20% and 40% for the laptop and desktop computer respectively, which helps in improving the energy efficiency through more deterministic transmissions.

7.1 Discussion

Software Defined Radio systems move from the streaming architecture of traditional radio to one of general purpose processing. We try to compensate for the increase in latency caused by this transition by reducing the buffering delays, moving the system closer to the streaming architecture. But decreasing the buffering delays increases the processing overhead, as multiple execution calls are necessary for processing the same amount of data. For lowest latency, a balance needs to be struck between reducing the buffering delays and increasing the processing overhead. Host computers with higher processing resources can compensate for this extra overhead and can reduce the overall latency significantly.

Even with our mitigation strategies, we cannot satisfy the timing constraints defined in the IEEE 802.15.4 protocol specification. This project was able to reduce the problem of blind spots and long IFS described using Figure 1.2, but the problem still exists. The major timing constraint for IEEE 802.15.4 is the acknowledgement time of 198 µs, as MAC layer retransmissions can ensure packet delivery in most cases, reducing the problem of blind spots. This timing constraint cannot be satisfied with the LimeSDR-USB platform using a software based implementation. However, when we relaxed this acknowledgement time constraint to 10 ms, a GNU Radio based setup was able to communicate reliably with an off-the-shelf Zolertia Firefly node. Note that this experiment used a different setup and is outside the scope of this thesis. Also, the main motivation for this setup was to have reliable communication with the node, hence this 10 ms acknowledgement time is not the absolute lower bound for a LimeSDR-USB based solution.

Although we did not concentrate on decreasing the GNU Radio processing time in this thesis, it is quite significant. This is mainly because we were using older generation processors in our host computers. Modern multi-core processors have inherently more processing resources, with a greater number of hardware cores and higher clock speeds. These increased processing resources will help significantly decrease the GNU Radio processing time. Furthermore, the use of Graphics Processing Units (GPUs) and technologies like Radio Frequency Network on Chip (RF-NoC) will help reduce it even further.

7.2 Limitations

In this study, the impact of other radio nodes and radio channel conditions was not investigated. Because of the inherent delays discussed in this report, LimeSDR based systems might have difficulty transmitting reliably in a highly congested radio environment, which will increase the number of retransmissions and the overall latency. If the network does not take the presence of these inherent delays into consideration, the overall network performance can also be affected. Also, in this study we focused on solving the delay problem for a low data rate network. This solution may not be feasible for higher data rate networks, as it might lead to buffer overflow problems.

7.3 Future Work

The network performance of our implementation needs to be evaluated in future studies. The implementation also lacks channel sensing functionality, which is needed for reducing the impact on the overall network performance. Future studies can also look at the feasibility of our mitigation techniques for higher data rate protocols. In this project, we studied the effect of coarse tuning of the GNU Radio buffer sizes without taking into consideration the processing time of the individual processing blocks. We believe the GNU Radio flow graph can be optimized further by taking these into consideration and fine tuning the buffer sizes of individual blocks for better latency performance.


Bibliography

[1] 5G-CORAL – A 5G Convergent Virtualised Radio Access Network Living at the Edge. URL: http://5g-coral.eu/ (visited on 09/29/2018).

[2] Partha Basak and Kishon Vijay Abraham I. “USB Debugging and Profiling Techniques”. In: (Oct. 4, 2018). URL: https://elinux.org/images/1/17/USB_Debugging_and_Profiling_Techniques.pdf.

[3] Bastian Bloessl, Christoph Leitner, Falko Dressler, and Christoph Sommer. “A GNU radio-based IEEE 802.15.4 testbed”. In: 12. GI/ITG KuVS Fachgespräch Drahtlose Sensornetze (FGSN 2013) (2013), pp. 37–40.

[4] Bastian Bloessl, Michele Segata, Christoph Sommer, and Falko Dressler. “Towards an Open Source IEEE 802.11p stack: A full SDR-based transceiver in GNU Radio”. In: Vehicular Networking Conference (VNC), 2013 IEEE. IEEE. 2013, pp. 143–149.

[5] Michael Buettner and David Wetherall. “A software radio-based UHF RFID reader for PHY/MAC experimentation”. In: RFID (RFID), 2011 IEEE International Conference on. IEEE. 2011, pp. 134–141.

[6] Johnathan Corgan. “GNU Radio Runtime Operation”. In: GRCON 2015 (2015).

[7] Debugfs Documentation. Linux Kernel Archives. URL: https://www.kernel.org/doc/Documentation/filesystems/debugfs.txt.

[8] Johannes Demel, Sebastian Koslowski, and Friedrich K Jondral. “A LTE receiver framework using GNU Radio”. In: Journal of Signal Processing Systems 78.3 (2015), pp. 313–320.

[9] Direct digital synthesis. Oct. 2018. URL: https://en.wikipedia.org/w/index.php?title=Direct_digital_synthesis&oldid=864523777 (visited on 12/04/2018).


[10] Double Data Rate I/O (ALTDDIO_IN, ALTDDIO_OUT, and ALTDDIO_BIDIR) IP Cores User Guide. URL: https://www.intel.com/content/dam/www/programmable/us/en/pdfs/literature/ug/ug_altddio.pdf.

[11] EZ-USB FX3™ SuperSpeed USB 3.0 peripheral controller. URL: http://www.cypress.com/products/ez-usb-fx3-superspeed-usb-30-peripheral-controller (visited on 10/03/2018).

[12] Getting Started with EZ-USB® FX3™. URL: http://www.cypress.com/file/139296/download (visited on 12/04/2018).

[13] HART. Mar. 2016. URL: https://fieldcommgroup.org/technologies/hart (visited on 12/04/2018).

[14] “IEEE Standard for Low-Rate Wireless Networks”. In: IEEE Std 802.15.4-2015 (Revision of IEEE Std 802.15.4-2011) (Apr. 2016), pp. 1–709. DOI: 10.1109/IEEESTD.2016.7460875.

[15] Internet of Things outlook – Ericsson. Ericsson.com. Nov. 9, 2017. URL: https://www.ericsson.com/en/mobility-report/reports/november-2017/internet-of-things-outlook (visited on 08/01/2018).

[16] IPv6 over Low power WPAN (6lowpan). URL: https://datatracker.ietf.org/wg/6lowpan/about/ (visited on 12/04/2018).

[17] ISO/IEC 7498-1:1994 - Information technology – Open Systems Interconnection – Basic Reference Model: The Basic Model. URL: https://www.iso.org/standard/20269.html (visited on 09/30/2018).

[18] LimeSDR. Myriad. URL: https://myriadrf.org/projects/limesdr/ (visited on 08/01/2018).

[19] LMS7002M Datasheet. Lime Microsystems. URL: https://limemicro.com/technology/lms7002m/ (visited on 09/30/2018).

[20] George Nychis, Thibaud Hottelier, Zhuocheng Yang, Srinivasan Seshan,and Peter Steenkiste. “Enabling MAC Protocol Implementations on Software-Defined Radios.” In: NSDI. Vol. 9. 2009, pp. 91–105.

[21] André Puschmann, Mohamed A Kalil, and Andreas Mitschele-Thiel. “Implementation and evaluation of a practical SDR testbed”. In: Proceedings of the 4th International Conference on Cognitive Radio and Advanced Spectrum Management. ACM. 2011, p. 15.

[22] Thomas Schmid. “Gnu radio 802.15.4 en- and decoding”. In: UCLA NESL TR-UCLA-NESL-200609-06, Tech. Rep (2006).


[23] Thomas Schmid, Oussama Sekkat, and Mani B Srivastava. “An experimental study of network performance impact of increased latency in software defined radios”. In: Proceedings of the second ACM international workshop on Wireless network testbeds, experimental evaluation and characterization. ACM. 2007, pp. 59–66.

[24] Signals & Slots | Qt 4.8. URL: https://doc.qt.io/archives/qt-4.8/signalsandslots.html (visited on 08/20/2018).

[25] The Differences Between Receiver Types, Part 1. Feb. 2016. URL: https://www.mwrf.com/systems/differences-between-receiver-types-part-1 (visited on 12/04/2018).

[26] Nguyen B Truong and Chansu Yu. “Investigating latency in GNU software radio with USRP embedded series SDR platform”. In: Broadband and Wireless Computing, Communication and Applications (BWCCA), 2013 Eighth International Conference on. IEEE. 2013, pp. 9–14.

[27] USB Data Transfer Types. URL: http://www.jungo.com/st/support/documentation/windriver/10.2.0/wdusb_manual.mhtml/USB_data_transfer_types.html.

[28] USBMon Documentation. The Linux Kernel Archives. URL: https://www.kernel.org/doc/Documentation/usb/usbmon.txt.

[29] WARP Project. URL: https://warpproject.org/trac (visited on 08/02/2018).

[30] Wime Project. URL: https://www.wime-project.net/ (visited on 10/03/2018).

[31] Zigbee Alliance. URL: https://www.zigbee.org/ (visited on 12/04/2018).


Appendix A

Background

A.1 CSMA and TDMA

CSMA is a Layer 2 (L2) protocol in the OSI model. It mainly comes in two varieties: carrier-sense multiple access with collision detection (CSMA/CD) and carrier-sense multiple access with collision avoidance (CSMA/CA). The flow graphs of both are shown in Figure A.1. In the older CSMA/CD, a node checks whether the channel is idle once the frame is ready. If the channel is idle, the node starts transmission. During transmission, it monitors the medium for collisions. If a collision is detected, it employs a collision recovery process in which it sends a jam signal to tell the other nodes that a collision has occurred. It then waits for a random delay and starts transmission again.

CSMA/CA tries to avoid collisions. It starts off like CSMA/CD by sensing the channel to check whether it is idle. If the channel is idle, the node starts transmission. It is difficult for wireless nodes to detect collisions while they are transmitting. Therefore, the node relies on an Acknowledgement (ACK) message from the receiving node to check whether the data packet was received. If no ACK is received, the node assumes that a collision has occurred and uses exponential back-off to determine when to re-initiate transmission.
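The ACK-driven retry logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the callback interface and the retry limit are hypothetical, and the exponent bounds (3 and 5) are example values, not parameters taken from our implementation.

```python
import random


def backoff_slots(attempt, min_be=3, max_be=5, rng=random):
    """Pick a random back-off, in slots, for the given retry (0-based).

    The back-off exponent grows by one per failed attempt and is capped at
    max_be, so the window [0, 2**be - 1] doubles until it hits the cap.
    min_be and max_be are illustrative defaults (assumption).
    """
    be = min(min_be + attempt, max_be)
    return rng.randint(0, 2 ** be - 1)


def send_with_ack(channel_is_idle, transmit, max_retries=3, rng=random):
    """Hypothetical CSMA/CA sender loop.

    channel_is_idle() -> bool : carrier sense result
    transmit() -> bool        : sends the frame, True if an ACK came back
    Returns (delivered, total number of slots spent waiting).
    """
    waited = 0
    for attempt in range(max_retries + 1):
        # A missing ACK is treated as a collision: back off before retrying.
        if attempt > 0:
            waited += backoff_slots(attempt - 1, rng=rng)
        # Carrier sense: defer one slot at a time while the channel is busy.
        while not channel_is_idle():
            waited += 1
        if transmit():
            return True, waited
    return False, waited
```

The sketch counts waiting time in abstract slots instead of sleeping, which keeps it deterministic enough to test with a seeded random generator.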

TDMA is also an L2 protocol, in which a coordinator schedules medium access for the nodes in a periodic manner. Communication happens in time slots. Each node in the network is given exclusive access to transmit during its time slot. The coordinator generates beacon signals periodically to maintain relative time synchronization. On receiving a beacon, the nodes adjust their transmit clocks so that they have a correct estimate of their time slots.
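As a rough sketch of how a node could derive its transmit window from the last received beacon, consider the following Python function. The function name and the frame layout (equal-length slots counted from the beacon) are assumptions made for illustration only.

```python
def next_slot_window(now, last_beacon, frame_period, slot_duration, slot_index):
    """Return (start, end) of this node's next transmit slot.

    All times use one consistent unit on the node's local clock.
    last_beacon is the local arrival time of the most recent beacon, which
    is assumed to mark the start of a frame (hypothetical frame layout).
    """
    offset = slot_index * slot_duration
    if offset + slot_duration > frame_period:
        raise ValueError("slot does not fit inside the frame")
    # Whole frames that have elapsed since the beacon was received.
    frames = int((now - last_beacon) // frame_period)
    start = last_beacon + frames * frame_period + offset
    if start < now:
        # The slot in the current frame has already started; take the next one.
        start += frame_period
    return start, start + slot_duration
```

For example, with a beacon at time 100, a frame period of 100, a slot duration of 10 and slot index 2, a node asking at time 105 gets the window (120, 130). Re-anchoring on every beacon is what keeps the estimate from drifting with the local clock.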


Figure A.1: CSMA flow graph.

A.2 GNU Radio

A.2.1 GNU Radio Block Types

The relationship between the number of input and output elements defines the type of a GNU Radio processing block, as shown in Table A.1. The block type tells the scheduler how the block processes information. There are two types of blocks: synchronous blocks and general blocks. For synchronous blocks, there is a rational relationship between the number of input and output elements. The sink, source, interpolation and decimation blocks in Table A.1 are synchronous blocks. The key difference between the block types is in how the scheduler handles the input and output buffers of each block. For synchronous blocks, the scheduler implicitly handles the input and output pointers to the buffers. For general blocks, the work function needs to report explicitly how many elements it consumed and produced.

Number of input elements   Number of output elements   Name
N                          0                           Sink block
0                          N                           Source block
1                          N                           Interpolation block
N                          1                           Decimation block
M                          N                           General block

Table A.1: GNU Radio Block Types
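The difference in bookkeeping between the two block types can be illustrated with a toy model in plain Python. The classes below are hypothetical and do not use the GNU Radio API; they only mimic the contract in which a fixed-rate block returns its output while a general block must also report how much input it consumed.

```python
class ToySyncDecimator:
    """Fixed-rate block: every `decim` inputs yield exactly one output."""

    def __init__(self, decim):
        self.decim = decim

    def work(self, inputs):
        # Only the produced items are returned. Because the rate is fixed,
        # a scheduler can infer that len(outputs) * decim inputs were used.
        usable = len(inputs) - len(inputs) % self.decim
        return [inputs[i] for i in range(0, usable, self.decim)]


class ToyGeneralBlock:
    """Variable-rate block: drops negative samples, so M:N depends on data."""

    def general_work(self, inputs):
        outputs = [x for x in inputs if x >= 0]
        consumed = len(inputs)
        # Both counts are reported explicitly, since neither can be
        # derived from the other.
        return consumed, outputs
```

For instance, `ToySyncDecimator(2).work([1, 2, 3, 4, 5])` keeps every second item of the four usable inputs, whereas the general block returns the pair (items consumed, items produced).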

A.2.2 GNU Radio Interfaces

In a GNU Radio flow graph, data is passed from one block to another, so the method of passing data between blocks needs to be defined. This method is defined by the block interfaces. Stream interfaces are intended to stream large amounts of data between blocks with variable processing rates. They use large buffers to pass the data from one block to the next. Stream interfaces work well for samples, bits, etc., but they are not the right method for passing metadata, control information or bursts of data between blocks, as this involves significant overhead.

GNU Radio recently added the message passing interface for handling asynchronous message passing. GNU Radio also supports stream tags for handling metadata that is closely associated with the stream data samples. Stream tags are attached to stream data samples and provide additional information associated with the sample. They can be used for passing control flags as well as metadata such as the Packet Data Unit (PDU) size, timing information, etc. Stream tags are propagated to the downstream blocks, and their positions are updated to follow data rate changes. For example, if a block takes 2 samples as input and produces 4 samples as output, its relative rate is 2. In this case, if the input stream had a stream tag at position x, then its location in the output stream would be 2x.
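The position update in the example above amounts to scaling each tag offset by the ratio of output to input items. A minimal Python sketch follows; the function name is hypothetical, and rounding down for non-integer results is an assumption, not necessarily what the GNU Radio runtime does.

```python
from fractions import Fraction


def propagate_tag_offsets(tag_offsets, items_in, items_out):
    """Map tag positions on the input stream to the output stream.

    The relative rate is items_out / items_in. Fraction keeps the scaling
    exact; non-integer positions are rounded down (assumption).
    """
    rate = Fraction(items_out, items_in)
    return [int(offset * rate) for offset in tag_offsets]
```

With 2 input items per 4 output items, a tag at offset 3 moves to offset 6. With a decimating block (2 in, 1 out), a tag at offset 5 lands on offset 2 under the rounding assumption.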


A.3 LMS7002M

Figure A.2: Block Diagram of LMS7002M.

LMS7002M offers a full duplex communication link on both the TX and RX chains. Each of the RX chains has three separate RF ports tuned for narrow band low frequency, narrow band high frequency and wide band operation. Similarly, the TX chains are connected to two separate RF ports tuned for high frequency and low frequency operation. This separation is done for better impedance matching at the boundary of the antennas.

Figure A.2 shows the functional block diagram of the LMS7002M FPRF. Since both the RX and TX paths are identical, the report concentrates on only one RX path. The output from the RF RX ports is fed into the LNA in order to minimize the noise injected at the beginning of the chain. The receiver follows the architecture shown in Figure 2.2, with an RX mixer followed by a filter and a Programmable Gain Amplifier (PGA), combined in a Zero-IF architecture. The RX PGA outputs the analog baseband signal.

LMS7002M uses a fractional-N PLL architecture for local oscillator frequency synthesis. PLLs are used extensively in RF circuits to make sure that the generated local oscillator signal and the reference signal have the same phase and frequency. PLLs are essentially negative feedback systems, so when the input signal differs from the output signal, the control logic tries to lower the error (input - output).

Integer-N PLL architectures are used to generate high frequency signals from low frequency reference clocks by placing a frequency divider in the negative feedback path. The frequency divider is basically a counter that outputs one pulse every N clock cycles of the output signal, where N is the division factor of the loop. However, since the output signal frequency is an integer multiple of the reference clock, the output frequency resolution is determined by the reference clock.

So, to obtain a high frequency as well as a fine resolution, the divider counter would have to be very large. To counter this problem, fractional-N PLL architectures were designed, in which the output signal frequency can also be a fractional multiple of the input signal frequency. This increases the frequency resolution without the need for a large divider counter. The input and output frequency relationship for a fractional-N PLL can be summarized by $f_{out} = f_{ref}\,(N + k/M)$, where N is the integer divider factor, k is the fractional divider factor and 1/M gives the output frequency resolution as a fraction of $f_{ref}$. Both the integer and fractional divider factors are limited by the size of the counters used. In the case of LimeSDR, the reference signal fed to the PLL can vary from 10 to 52 MHz. The output signal can vary from 30 to 3800 MHz, with a frequency resolution of 24.8 Hz.
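To make the fractional-N relationship concrete, the sketch below picks N and k for a target frequency and returns the frequency actually synthesized. The function name is hypothetical. The example values in the usage note (a 52 MHz reference and a modulus of 2^21) are assumptions chosen because they give a step of about 24.8 Hz; they are not taken from the LMS7002M documentation.

```python
def fractional_n_settings(f_ref, f_target, modulus):
    """Choose N and k so that f_ref * (N + k / modulus) is closest to f_target.

    modulus plays the role of M in the text; the frequency step is
    f_ref / modulus.
    """
    # Express the wanted ratio in units of 1/modulus and round to the
    # nearest achievable value.
    ratio_scaled = round(f_target * modulus / f_ref)
    n, k = divmod(ratio_scaled, modulus)
    f_out = f_ref * (n + k / modulus)
    return n, k, f_out
```

Calling `fractional_n_settings(52e6, 2.405e9, 2**21)` for the 2405 MHz centre frequency of 802.15.4 channel 11 gives N = 46 plus a fractional part, and the result deviates from the target by at most half a step, where one step is f_ref / M.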

Once the RF demodulation is completed by the analog processing chain, the analog signal is sent to the data converters and converted to digital data samples. The sampling rate for the data conversion is determined by the required RF channel bandwidth. The digital samples are sent to the Transceiver Signal Processor (TSP) for further processing.

The TSP uses signal processing algorithms such as IQ DC offset correction and IQ phase correction to correct the received samples. An interpolation filter and a decimation filter are added to the TSP for the TX and RX chains respectively. These filters are implemented as a chain of five fixed-coefficient half-band FIR filters, which allows interpolation and decimation factors of 1, 2, 4, 8 and 16. Interpolation and decimation allow the baseband to run at a lower data rate while the data converters still run at higher sampling rates, which spreads the quantization noise over a larger frequency range. Automatic Gain Control is also implemented by the TSP.

A.4 USBMon

The details of the USBMon I/O traces for the text and raw binary interfaces are presented here.

A.4.0.1 Text Data Format

URB Tag            Timestamp    Event Type   Address       URB Status
ffff8fbdbbae4000   2942307806   S            Bo:3:008:15   -115

Data Length   Data Tag   Data
64            =          21000100 00000000 002a0484 00000000 000000

Table A.2: Text USB Trace Example.

• URB Tag: The URB identification number. It is usually the in-kernel address of the URB structure.

• Timestamp: The timestamp for the URB event at the HCD in microseconds. It is measured by the usbmon main utility using the gettimeofday() function of time.h.

• Event Type: It specifies the type of the HCD event: S for submission, C for completion (callback) and E for submission error.


• Address: It consists of four fields separated by colons: the URB type and direction, the bus number, the device number and the endpoint number. The URB type and direction field specifies the type of USB transfer (which can be synchronous or asynchronous) and its direction.

Bi Bo   Bulk Input and Output
Ci Co   Control Input and Output
Ii Io   Interrupt Input and Output
Zi Zo   Isochronous Input and Output

Table A.3: URB Type and Direction.

A USB device transfers data through a pipe between a memory buffer on the host and an endpoint on the device. The type of data transfer depends on the endpoint and the requirements of the function. The transfer types are as follows [27]:

– Control Transfers: Mainly used for configuration, command and status operations.

– Bulk Transfers: Used for large, non-periodic, non-time-sensitive burst transmissions.

– Interrupt Transfers: Mainly used for sending small amounts of data infrequently or asynchronously.

– Isochronous Transfers: Mainly used for periodic, continuous streams of time-sensitive data.

• Data Length: For URB submission it gives the requested data length, and for callbacks it is the actual data length.

• Data Tag: If this field is ’=’, then data words are present.

• Data: The data words contained in the USB transfer packet.
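The field layout listed above can be turned into a small parser for post-processing traces. The Python sketch below only handles lines shaped like the bulk-transfer example in Table A.2; control, interrupt and isochronous lines carry extra fields and are outside its scope. All names are hypothetical.

```python
from collections import namedtuple

UsbmonEvent = namedtuple(
    "UsbmonEvent",
    "urb_tag timestamp_us event_type xfer_type direction bus device endpoint "
    "status length data_tag data_words",
)

# First letter of the address field, see Table A.3.
XFER_TYPES = {"B": "bulk", "C": "control", "I": "interrupt", "Z": "isochronous"}


def parse_usbmon_line(line):
    """Parse one usbmon text line laid out as in Table A.2 (bulk case only)."""
    fields = line.split()
    tag, timestamp, event, address, status, length = fields[:6]
    type_dir, bus, device, endpoint = address.split(":")
    data_tag = fields[6] if len(fields) > 6 else ""
    # Data words are only present when the data tag is '='.
    words = fields[7:] if data_tag == "=" else []
    return UsbmonEvent(
        urb_tag=int(tag, 16),
        timestamp_us=int(timestamp),
        event_type=event,
        xfer_type=XFER_TYPES[type_dir[0]],
        direction="in" if type_dir[1] == "i" else "out",
        bus=int(bus),
        device=int(device),
        endpoint=int(endpoint),
        status=int(status),
        length=int(length),
        data_tag=data_tag,
        data_words=words,
    )
```

Since a submission and its callback share the same URB tag, the delay of a transfer can be estimated by subtracting the `timestamp_us` of the S event from that of the matching C event.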

A.4.0.2 Raw Binary

The overall data format is the same as for the text data. The data is available in raw binary form by accessing the character devices at /dev/usbmonX. The data can be read by using read with ioctl or by mapping the buffer using mmap. The usbmon events are buffered in the following format:


    struct usbmon_packet {
        u64 id;                   /*  0: URB ID - from submission to callback */
        unsigned char type;       /*  8: Same as text; extensible. */
        unsigned char xfer_type;  /*     ISO (0), Intr, Control, Bulk (3) */
        unsigned char epnum;      /*     Endpoint number and transfer direction */
        unsigned char devnum;     /*     Device address */
        u16 busnum;               /* 12: Bus number */
        char flag_setup;          /* 14: Same as text */
        char flag_data;           /* 15: Same as text; Binary zero is OK. */
        s64 ts_sec;               /* 16: gettimeofday */
        s32 ts_usec;              /* 24: gettimeofday */
        int status;               /* 28: */
        unsigned int length;      /* 32: Length of data (submitted or actual) */
        unsigned int len_cap;     /* 36: Delivered length */
        union {                   /* 40: */
            unsigned char setup[SETUP_LEN];  /* Only for Control S-type */
            struct iso_rec {                 /* Only for ISO */
                int error_count;
                int numdesc;
            } iso;
        } s;
        int interval;             /* 48: Only for Interrupt and ISO */
        int start_frame;          /* 52: For ISO */
        unsigned int xfer_flags;  /* 56: copy of URB's transfer_flags */
        unsigned int ndesc;       /* 60: Actual number of ISO descriptors */
    };
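A record in this layout can be decoded in user space with Python's struct module. The sketch below is an assumption-laden illustration: it takes the field offsets from the comments in the structure (64 bytes in total, with an 8-byte union), decodes a byte buffer that has already been read, and does not perform the device I/O itself. The tuple and function names are hypothetical.

```python
import struct
from collections import namedtuple

# "=" selects native byte order with standard sizes and no padding, so the
# fields land on the offsets given in the C comments (0, 8, 12, ..., 60).
USBMON_PACKET = struct.Struct("=QBBBBHccqiiII8siiII")

UsbmonPacket = namedtuple(
    "UsbmonPacket",
    "id type xfer_type epnum devnum busnum flag_setup flag_data ts_sec ts_usec "
    "status length len_cap setup_or_iso interval start_frame xfer_flags ndesc",
)


def decode_usbmon_packet(buf):
    """Decode the first 64 bytes of buf as one usbmon_packet header."""
    return UsbmonPacket._make(USBMON_PACKET.unpack_from(buf))


def timestamp_us(pkt):
    """Combine ts_sec and ts_usec into a single microsecond timestamp."""
    return pkt.ts_sec * 1_000_000 + pkt.ts_usec
```

The `type` field holds the event character as a byte value, so `chr(pkt.type)` yields 'S', 'C' or 'E' as in the text format. The union is returned as raw bytes in `setup_or_iso` and is left for the caller to interpret according to the transfer type.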


www.kth.se

TRITA-EECS-EX-2019:774