TRANSCRIPT
Processor Intrusion Detection
Mark Zwolinski
June 2018
@MarkZwolinski
Outline
• Background
• Anomalous behaviour
• Security monitoring
• Responses
• Summary
How is hardware different to software?
• Hardware exists in the real world
– Physical access allows side-channel attacks
• Implementation is not the same as design
– Timing
– Energy
• Every device is unique
– Variability
On-Chip Intrusion Detection
• Hypothesis: Embedded systems do predictable things
• Therefore anomalous behaviour occurs because something bad has happened
– Reliability problem
• One-off (radiation) or gradual (ageing)
– Security problem
• Sudden, sustained
• Resulting actions may be very different
Normal Behaviour
• Processors have built-in Hardware Performance Counters for code optimisation – can we reuse them?
• Different programs look different – Committed Instructions:
Normal Behaviour
• Different programs look different – Committed Instructions. Also varies with data:
Anomaly Detection
• Security anomaly may cause different types of unusual behaviour
– Program Counter has unusual pattern
– Cache Miss rate suddenly increases
– Temperature suddenly rises
• Can we model this easily?
– Instruction-set simulator (gem5)
– Simulating 0.15 s of execution takes 900 s and generates 7.5 GB of data
• US$20k Microsoft Azure award
Anomalous Behaviour
• Inserting single bit-flips into different registers produces different effects:
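The injected faults can be sketched as a single bit-flip in a register value, modelled as an unsigned integer. This is an illustrative sketch only; the 32-bit width and simple XOR fault model are assumptions, not the actual gem5 fault-injection setup:

```python
def flip_bit(value: int, bit: int, width: int = 32) -> int:
    """Flip a single bit of a register value (modelled as an unsigned integer)."""
    return (value ^ (1 << bit)) & ((1 << width) - 1)

# The effect of a single-event upset depends heavily on which bit
# (and which register) is hit: a low bit barely changes the value,
# a high bit changes it drastically.
r = 100
print(flip_bit(r, 0))   # 101
print(flip_bit(r, 31))  # 2147483748
```

Whether such a flip manifests as a crash, a hang, a fail-silence violation, or nothing at all then depends on how the corrupted value is used downstream.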
Anomalous Behaviour
• Four main types of behaviour:
[Figure: Failures Distribution on QSort Benchmark – % of faults manifested as errors (0–100%) for various stages of the pipeline (Fetch, Decode, Execute, Load/Store); categories: Crash, Hang, Fail-Silence Violation, Not Manifested]
Anomaly Monitoring
• On-chip learning
– Self-monitoring (the monitoring activity is then itself part of the normal pattern)
– Auxiliary processor (Quis custodiet ipsos custodes?)
– Buddy processors
• Distributed Intelligence
• Scalability?
Detection: Moving Average Cache Misses
• True Positives (TP): 6
• False Positives (FP): 5
• False Negatives (FN): 39
• True Negatives (TN): 171
• Accuracy: 0.80
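A moving-average detector of the kind scored above can be sketched as follows. The window size, threshold factor, and the synthetic cache-miss trace are illustrative assumptions, not the parameters used in the experiments:

```python
from collections import deque

def moving_average_detector(samples, window=8, threshold=3.0):
    """Flag a sample as anomalous when it exceeds `threshold` times
    the moving average of the preceding `window` samples."""
    history = deque(maxlen=window)
    flags = []
    for x in samples:
        avg = sum(history) / len(history) if history else x
        flags.append(avg > 0 and x > threshold * avg)
        history.append(x)
    return flags

# Synthetic cache-miss trace: steady background with a sudden,
# sustained spike, as a security anomaly might produce.
trace = [10, 12, 11, 9, 10, 11, 80, 85, 10, 11]
print(moving_average_detector(trace))  # only the spike samples are flagged
```

The high false-negative count in the table above is typical of such a simple detector: anomalies that do not produce a large, abrupt jump in the monitored counter stay below the threshold.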
Detection: Autoencoder Cache Misses
• True Positives (TP): 4
• False Positives (FP): 0
• False Negatives (FN): 7
• True Negatives (TN): 176
• Accuracy: 0.96
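The autoencoder approach can be sketched as: train on windows of normal cache-miss samples, then flag windows whose reconstruction error is high. The minimal linear autoencoder below, with its synthetic loop-shaped miss profile, window size, and learning rate, is an illustrative assumption throughout, not the detector evaluated above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic training set: 8-sample windows of "normal" cache-miss counts.
# Normal behaviour is a loop-shaped profile with small per-run variation.
profile = np.array([0.2, 0.5, 0.9, 1.0, 1.0, 0.9, 0.5, 0.2])
scale = 1.0 + 0.1 * rng.normal(size=(200, 1))
windows = scale * profile + 0.02 * rng.normal(size=(200, 8))

# A minimal linear autoencoder: compress each window to 2 latent values.
n_in, n_hidden = 8, 2
W_enc = 0.1 * rng.normal(size=(n_in, n_hidden))
W_dec = 0.1 * rng.normal(size=(n_hidden, n_in))

lr = 0.02
for _ in range(3000):
    latent = windows @ W_enc
    recon = latent @ W_dec
    err = recon - windows
    # Gradient descent on the mean-squared reconstruction error
    W_dec -= lr * latent.T @ err / len(windows)
    W_enc -= lr * windows.T @ (err @ W_dec.T) / len(windows)

def recon_error(window):
    """Mean-squared reconstruction error of one window."""
    w = np.asarray(window, dtype=float)
    return float(np.mean((w @ W_enc @ W_dec - w) ** 2))

normal_window = 1.05 * profile      # in-distribution shape
attack_window = np.full(8, 1.0)     # sustained flat miss rate, out of pattern
print(recon_error(normal_window) < recon_error(attack_window))
```

Because the autoencoder learns the shape of normal windows rather than a single threshold, it can flag anomalies of similar magnitude to normal traffic, which is consistent with the lower false-negative count in the table above.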
Detection time
• Random bit flips inserted – hang/crash/silent/not seen
• How long to detection? Minimum detection time shown in green.
• Note: the simulator does not sample every clock cycle, and the benchmarks are loops.
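Given per-sample anomaly flags from a detector, detection time is simply the delay from fault injection to the first flagged sample. A trivial sketch, where the sampling period is an assumed parameter (as noted, the simulator does not sample every cycle):

```python
def detection_latency(flags, sample_period_us=100):
    """Time from the start of the trace to the first flagged sample,
    in microseconds; None if the anomaly is never detected."""
    for i, flagged in enumerate(flags):
        if flagged:
            return i * sample_period_us
    return None

# A fault injected at sample 0 that is first flagged at sample 3:
print(detection_latency([False, False, False, True, True]))  # 300
```

The sampling period puts a floor on the achievable detection time, which is why the minimum detection times reported are bounded by the sampling resolution rather than the clock period.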
Software Attack
• Normal behaviour on left.
• Code injection attack on right.
• (NB Sampling resolution gives slight differences.)
[Figure: CPU Cache Overall Miss Rate (0–0.1) over ~650 samples; normal execution vs. code-injection attack]
Responses
• Single-event reliability problems can be mitigated by replay
• Long-term reliability problems require monitoring and preventative maintenance
• Security problems require something different
– fail-safe or fail-secure?
Summary
• On-chip monitoring for reliability
• Similar monitoring for security?
• Learn normal vs abnormal behaviour
• Fail-safe or fail-secure?
Is there any alternative?
"We believe that our results motivate a different type of defense; a defense where trusted circuits monitor the execution of untrusted circuits, looking for out-of-specification behavior in the digital domain."
K. Yang, M. Hicks, Q. Dong, T. Austin, and D. Sylvester, “A2: Analog Malicious Hardware”, Proceedings of the IEEE Symposium on Security and Privacy, May 2016.
Thanks
• Thanks to Elena Woo and Miao Yu, who did most of the simulations.