
Liang Xiao, Xiaozhen Lu, Dongjin Xu, Yuliang Tang,

Lei Wang*, Weihua Zhuang**

Xiamen University

* Nanjing University of Posts and Telecommunications

** University of Waterloo

Nov. 30, 2017

University of Macau

UAV Relay in VANETs Against Smart Jamming with Reinforcement Learning

Outline

Jamming attacks in vehicular ad-hoc networks (VANETs)

Unmanned aerial vehicle (UAV)-aided VANET transmission against jamming

Reinforcement learning techniques

Hotbooting policy hill climbing (PHC)-based relay strategy

Simulation results

Conclusion


Jamming Attacks in VANETs

[Figure: 2020 VANETs market of China (billion dollars)]

Jammers send faked or replayed signals to block the communication of onboard units (OBUs) with other OBUs or roadside units (RSUs) in VANETs

Remote jamming attacks in VANETs:

Control of smart vehicles such as BMW, Mercedes, and Chrysler, 2013

Tencent Keen Security Lab takes control of the Tesla vehicular system, 2016

Disabling the brakes and shutting down the engine of a Jeep, 2015 (first car recall event)

Smart Jamming in VANETs

Apply smart radio devices such as USRPs and WARPs to observe the ongoing transmission and eavesdrop on the VANET control channel

Analyze the anti-jamming transmission policy of VANETs

Flexibly choose the jamming policy against VANETs

[Figure: VANET with a large-scale network topology and high mobility, showing onboard units (OBUs), a roadside unit (RSU), and a smart jammer that uses a smart device to send faked/replayed signals]

Jamming Resistance of VANETs

[Figures: a time slot / frequency channel (1 to N) grid, and the IEEE 802.11p channel plan on a frequency axis from 5.850 to 5.925 GHz. The plan shows seven 10 MHz channels (optional 20 MHz), ch 172 to ch 184: a control channel plus service channels labeled Public Safety V2V (dedicated), Public Safety Intersections (dedicated), Public Safety/Private (short range), and Public Safety/Private (medium range)]

Frequency hopping-based anti-jamming techniques are not applicable in VANETs due to the restricted bandwidth resource

Ambient noise immunity-based anti-jamming transmission to improve the packet delivery rate in VANETs [Puñal'12]

Anti-jamming hideaway strategy determines whether to keep silent based on the packet transmission ratio [Azogu'13]

UAV-Aided VANETs

[Figure: UAV-aided ubiquitous coverage and UAV-aided relaying, showing a ground gateway, the core network, a malfunctioning base station, and an overloaded base station]

UAVs relay messages for VANETs to extend coverage and improve network connectivity

Faster to deploy

Better channel states connecting OBUs: line-of-sight links & smaller path-loss exponents

UAVs have been used to relay mobile messages for ground terminals to maximize the capacity of VANETs [Dixon'12]

UAVs relay vehicles’ alarm messages regarding lethal attacks to improve the intrusion detection accuracy in VANETs [Sedjelmaci'16]

Network Model

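[Figure: network model with OBU1, OBU2, ..., OBUi, the UAV, RSU1, RSU2, a fiber link, the server, and the jammer. Links and nodes are annotated with channel gains h1(k) to h5(k), distances d1(k) to d5(k), and speeds v(k), v1(k), v2(k). Callouts: speed of OBU2; channel gain between UAV & jammer; distance between UAV & jammer]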

OBU aims to send a message to a server via RSU1 or UAV

Jammer sends faked or replayed signals to block the OBU-RSU1 link

UAV assists RSU1 to relay the OBU message to RSU2

RSU2 is far away from the jammer

Anti-jamming Transmission Game

UAV assists the VANET to improve the transmission quality with lower overall transmission costs

Jammer aims to degrade the VANET communication performance with lower jamming costs

Utility of the UAV: Overall transmission cost & BER of the signal received by the server

[Figure: game between the UAV (relay strategy) and the jammer (jamming power), annotated with SINR & bit error rate (BER) and quality of the control channel; network elements: server, RSU2, RSU1, OBU]

X. Lu, D. Xu, L. Xiao, L. Wang, and W. Zhuang, “Anti-jamming communication game for UAV-aided VANETs,” Globecom, 2017.

NE of the Game

Nash equilibrium (NE): neither the UAV nor the jammer can increase its utility by unilaterally deviating from the NE strategy
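In symbols, the NE condition can be written as below. The notation is assumed for illustration, since the slide does not give it: x denotes the UAV relay strategy, y the jamming power, u_U and u_J the utilities of the UAV and the jammer, and (x*, y*) the NE strategy pair.

```latex
u_U(x^*, y^*) \ge u_U(x, y^*) \quad \forall x,
\qquad
u_J(x^*, y^*) \ge u_J(x^*, y) \quad \forall y
```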

Dynamic Anti-Jamming Transmission Game

UAV relay process in the dynamic game can be viewed as a Markov decision process (MDP)

Determine the relay strategy in the dynamic VANET game without being aware of the jamming model and the network model

State:

(1) SINR of the signals received by UAV/RSU1 sent from OBU

(2) SINR of the signals received by RSU2 sent from UAV

(3) BER of OBU message

Action:

Relay the OBU message or not

[Figure: MDP over time slots k-1, k, k+1 (previous state, state, next state). The UAV acts on the environment (server, RSU1, RSU2, OBU, jammer) and receives feedback]

Reinforcement Learning

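Reinforcement learning such as Q-learning can achieve the optimal strategy for an MDP with finite states

[Figure: a learning agent interacting with the environment. The agent observes the current state, takes an action, and receives the communication performance, while the MDP moves from the previous state to the next state]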
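The Q-learning update mentioned above can be sketched as follows. This is a minimal tabular sketch, not the authors' implementation: the number of states, the learning rate `alpha`, the discount factor `gamma`, and the reward value are hypothetical choices made only for illustration. The two actions stand for "do not relay" and "relay".

```python
import random

def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: move Q[s][a] toward r + gamma * max Q[s_next]."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] = (1 - alpha) * Q[s][a] + alpha * target

def epsilon_greedy(Q, s, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(Q[s]))
    return max(range(len(Q[s])), key=lambda a: Q[s][a])

# Hypothetical sizes: 4 discretized states, 2 actions (0: no relay, 1: relay).
NUM_STATES, NUM_ACTIONS = 4, 2
Q = [[0.0] * NUM_ACTIONS for _ in range(NUM_STATES)]

# One illustrative transition: in state 0, relaying earned utility 1.0 and led to state 2.
q_update(Q, s=0, a=1, r=1.0, s_next=2)
best = epsilon_greedy(Q, 0, epsilon=0.0, rng=random.Random(0))
```

After this single update, `Q[0][1]` is 0.5 (half of the target 1.0, because `alpha` is 0.5 and all other entries are still zero), so the greedy action in state 0 is to relay.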

Typical RL Algorithms

RL techniques by category:

Model-free: Q-learning. Achieves the optimal strategy for an MDP without knowing the system model

Mixed-strategy: Policy hill climbing (PHC). Provides more randomness to fool attackers with uncertainty compared with Q-learning

Deep learning: Deep Q-network (DQN). Uses a deep network to compress the state to address the curse of dimensionality

Hotbooting: Fast DQN. Initializes the Q-values with the experiences in similar scenarios to avoid random exploration

Modeling: Dyna-Q. A partial system model is known

RL-based Anti-Jamming Communication

RL technique | Strategy | Constraints | Performance
Q-learning | Channel selection | Bandwidth | Communication gains minus cost & loss [Wu’12]; successful transmission rate [Gwon’13]; computation complexity [Aref’17]; energy cost & packet delivery rate [Dai’17]
Q-learning | Power control | Static networks | Signal-to-interference-plus-noise ratio (SINR) [Xiao’15]
Q-learning | Offloading rate | Computation | Attack rate [Xiao’16]
WoLF-PHC | Power control | Static networks | SINR [Xiao’15]
DQN | Channel selection | Mobility; bandwidth; network scale | SINR [Han’17]

PHC: Extension of Q-learning in mixed-strategy games to derive the optimal policy for an MDP

Provide more randomness in the UAV relay strategy to fool the jammers

Avoid being induced by the jammer to a specific relay strategy

Hotbooting: Transfer learning initializes the Q-value for each action-state pair with the experiences in similar anti-jamming UAV-aided VANET transmission scenarios

Decrease the random explorations at the beginning

Accelerate the learning speed in the dynamic game
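The mixed-strategy idea behind PHC can be illustrated with a short sketch. It follows a common form of the PHC policy update, in which probability mass moves toward the action with the highest Q-value by a small step. The step size `delta` and the example numbers are hypothetical, and the slides do not state the exact update rule used.

```python
import random

def phc_policy_update(pi_s, q_s, delta=0.1):
    """Shift probability toward the greedy action while keeping a valid distribution."""
    n = len(pi_s)
    best = max(range(n), key=lambda a: q_s[a])
    for a in range(n):
        if a != best:
            # Take at most delta / (n - 1) from each non-greedy action,
            # capped so that no probability becomes negative.
            step = min(pi_s[a], delta / (n - 1))
            pi_s[a] -= step
            pi_s[best] += step
    return pi_s

# Hypothetical state: relay probabilities start uniform, and the Q-values favor relaying.
pi_s = [0.5, 0.5]   # [P(no relay), P(relay)]
q_s = [0.2, 0.7]    # [Q(no relay), Q(relay)]
phc_policy_update(pi_s, q_s)

# The action is sampled from the mixed strategy instead of always taking the greedy one.
action = random.Random(0).choices(range(2), weights=pi_s)[0]
```

With these numbers the relay probability rises from 0.5 to 0.6. Because the action is still drawn at random, a jammer observing the UAV cannot predict the next relay decision with certainty.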

Hotbooting PHC-based Relay Strategy

UAV Relay Protocol

Hotbooting preparation:

(1) Emulate a similar UAV-aided VANET environment

(2) Establish N experiments with PHC algorithm

(3) Initialize Q function & mixed relay strategy probability

Learning process (UAV):

(1) Receive from server: SINR of the signal received by RSU1/2 and the UAV; BER of OBU message

(2) Formulate state

(3) Action selection

(4) Relay the OBU message

(5) Evaluate: transmission costs; SINR of the signal received by RSU1/2 and the UAV

(6) Obtain utility

(7) Update Q-function

(8) Update mixed relay strategy probability

[Figure: the environment comprises the jammer, server, OBU, RSU1, and RSU2, with a fiber link]
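The hotbooting preparation and the learning process of this protocol can be sketched end to end as below. Everything about the environment is a hypothetical stand-in: the two-level jamming state, the relay cost, the utility, and the parameters `alpha`, `gamma`, and `delta` are not from the slides. Carrying Q and the policy across the emulated experiments is also an assumption about how the preparation phase produces its initial values.

```python
import random

ACTIONS = (0, 1)      # 0: do not relay, 1: relay
NUM_STATES = 2        # hypothetical: 0 = weak jamming observed, 1 = strong jamming
RELAY_COST = 0.3      # hypothetical cost of relaying

def toy_env_step(action, state, rng):
    """Hypothetical emulated VANET: returns (utility, next_state)."""
    delivered = 1.0 if (action == 1 or state == 0) else 0.0
    utility = delivered - (RELAY_COST if action == 1 else 0.0)
    next_state = 1 if rng.random() < 0.5 else 0
    return utility, next_state

def phc_step(Q, pi, s, rng, alpha=0.5, gamma=0.5, delta=0.05):
    """One time slot: select action, obtain utility, update Q and mixed strategy."""
    a = rng.choices(ACTIONS, weights=pi[s])[0]
    u, s_next = toy_env_step(a, s, rng)
    Q[s][a] = (1 - alpha) * Q[s][a] + alpha * (u + gamma * max(Q[s_next]))
    best = max(ACTIONS, key=lambda x: Q[s][x])
    for x in ACTIONS:
        if x != best:
            step = min(pi[s][x], delta / (len(ACTIONS) - 1))
            pi[s][x] -= step
            pi[s][best] += step
    return u, s_next

def hotbooting_preparation(num_experiments, slots, seed=0):
    """Run emulated experiments with PHC and return the learned Q and policy."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(NUM_STATES)]
    pi = [[0.5, 0.5] for _ in range(NUM_STATES)]
    for _ in range(num_experiments):
        s = 0
        for _ in range(slots):
            _, s = phc_step(Q, pi, s, rng)
    return Q, pi

# Hotbooting preparation, then a learning process initialized from its output.
Q0, pi0 = hotbooting_preparation(num_experiments=5, slots=200)
rng = random.Random(1)
Q = [row[:] for row in Q0]
pi = [row[:] for row in pi0]
s = 0
for _ in range(100):
    _, s = phc_step(Q, pi, s, rng)
```

The point of the design is only the initialization: the learning process starts from `Q0` and `pi0` instead of zeros and a uniform policy, which is what lets it skip much of the early random exploration.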

Simulation Results

[Figure: simulation topology in the x/m and y/m plane with origin (0,0). Labels: server, OBU(70,15), UAV(50,80), RSU2(20,40), RSU1(100,40), fiber, v(k), (90,45), jammer, stationary]
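As a quick check on this topology, the horizontal distance from each node to the jammer can be computed from the listed coordinates. The sketch assumes that the (90,45) label marks the jammer position, which the adjacent labels suggest but the extracted text does not state outright. It also ignores the UAV altitude, which is not given.

```python
import math

# Coordinates in meters from the figure labels.
nodes = {"OBU": (70, 15), "UAV": (50, 80), "RSU1": (100, 40), "RSU2": (20, 40)}
jammer = (90, 45)  # assumption: the (90,45) label belongs to the jammer

# Two-dimensional Euclidean distance from each node to the jammer.
distances = {name: math.dist(pos, jammer) for name, pos in nodes.items()}

for name, d in sorted(distances.items(), key=lambda kv: kv[1]):
    print(f"{name}: {d:.1f} m from the jammer")
```

Under this assumption RSU1 is about 11.2 m from the jammer and RSU2 about 70.2 m, which matches the network model's premise that RSU2 is far away from the jammer.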

Simulation Results (cont.)

Jamming policy changed every 500 time slots

Channel condition randomly changed every 100 time slots

L. Xiao, X. Lu, D. Xu, Y. Tang, L. Wang, and W. Zhuang, “UAV Relay in VANETs Against Smart Jamming with Reinforcement Learning,” IEEE Transactions on Vehicular Technology, under review.

Simulation Results (cont.)

The transmission performance depends on the UAV location

[Figure panels: distance between the UAV & the jammer: 50 m; distance between the UAV & the jammer: 60 m]

Conclusion

We have formulated a UAV relay game for UAV-aided VANETs, providing the NEs of the game

We have proposed a hotbooting PHC-based UAV relay strategy to resist jamming attacks in VANETs without being aware of the jamming model and the network model

Future work

Improve the game model by incorporating more jamming and defense details

Accelerate the learning speed of the relay strategy to implement in practical VANETs

Provide robustness against utility evaluation errors with incomplete state information


Questions?

[email protected]

