
UW-Madison Computer Sciences Vertical Research Group © 2010

Relax: An Architectural Framework for Software Recovery of Hardware Faults

Marc de Kruijf, Shuou Nomura, Karthikeyan Sankaralingam

ISCA 2010 - 3

Executive Summary

Problem
  Technology is driving simple hardware
  Fault recovery requires complex hardware

Software Recovery
  Enables simple hardware
  High energy efficiency

Relax: An Architectural Framework for Software Recovery
  ISA: a well-defined interface for software recovery
  Software: support to use the ISA
  Hardware: support to implement the ISA

Architecture Trend
  Energy efficiency
  Hardware simplification

Applications Trend
  Data-intensive, error-tolerant applications
  Examples: search, computer vision, data mining, media processing, scientific computing, …

CMOS Trend
  Device variability, wear-out, soft errors

Hardware Recovery
  Inefficient
  No flexibility
  Checkpoints are conservative

Software Recovery
  Efficient
  Error tolerance
  Natural recovery points

Simple Hardware
  No speculative state

Complex Hardware
  Speculative state

Recovery Support Is Needed

Relax
  Software Recovery
  Hardware Detection
  ISA

Relax: ISA, Software, Hardware

ISA

Software defines the recovery handler

Hardware detects faults and jumps to the handler, and is allowed to commit corrupted state*

    rlx RECOVER
    ...
    RECOVER:
    ...

[Diagram labels: application error tolerance, software-defined recovery, flexibility; simplicity, energy efficiency]

SIMPLE HARDWARE

*Details in paper


-- WARNING -- SOURCE CODE AHEAD

Software

SAD (Sum of Absolute Differences) Example (adapted from an H.264 video encoder)

Original function:

    int sad(int *left, int *right, int len) {
      int sum = 0;
      for (int i = 0; i < len; ++i) {
        sum += abs(left[i] - right[i]);
      }
      return sum;
    }

With a relax block ("retry" shown; for "discard", use recover { return INT_MAX; } instead):

    int sad(int *left, int *right, int len) {
      relax {
        int sum = 0;
        for (int i = 0; i < len; ++i) {
          sum += abs(left[i] - right[i]);
        }
        return sum;
      } recover { retry; }
    }

Corresponding assembly:

    ENTRY:
      rlx RECOVER                      # Relax on
      mv 0 -> $sum
      ble $len, 0, EXIT
    LOOP_PREHEADER:
      mv 0 -> $i
    LOOP:
      ld [$left + $i * 4] -> $tmp1
      ld [$right + $i * 4] -> $tmp2
      abs $tmp1, $tmp2 -> $tmp3
      add $sum, $tmp3 -> $sum
      add $i, 1 -> $i
      blt $i, $len, LOOP
    EXIT:
      rlx 0                            # Relax off
      ret $sum
    RECOVER:
      jmp ENTRY                        # "retry"
                                       # for "discard": return 0x7FFFFFFF

[Images: raw and encoded video frames]

  1. No writes to memory
  2. Idempotent
  3. Recoverable by re-execution

SIMPLE + INTUITIVE + FLEXIBLE
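
The retry and discard behaviors of the SAD example can be approximated in standard C, which may help when reasoning about them without Relax hardware or compiler support. This is only a sketch under stated assumptions: `fault_probe_fn`, `sad_retry`, `sad_discard`, and `fail_n_times` are hypothetical names that do not appear in the talk, and the probe callback merely stands in for hardware fault detection, which in Relax would jump to the recovery address rather than be polled.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for hardware detection: returns true if a fault was
   detected while the region ran. Not part of the Relax ISA. */
typedef bool (*fault_probe_fn)(void *ctx);

/* Body of the relax region: reads memory but writes only local state, so it
   is idempotent and can be re-executed safely. */
int sad_body(const int *left, const int *right, int len) {
    int sum = 0;
    for (int i = 0; i < len; ++i) {
        sum += abs(left[i] - right[i]);
    }
    return sum;
}

/* "retry": re-execute the region until no fault is reported. */
int sad_retry(const int *left, const int *right, int len,
              fault_probe_fn fault_detected, void *ctx) {
    for (;;) {
        int sum = sad_body(left, right, len);
        if (!fault_detected(ctx)) {
            return sum;
        }
    }
}

/* "discard": on a fault, abandon the result and return the worst score. */
int sad_discard(const int *left, const int *right, int len,
                fault_probe_fn fault_detected, void *ctx) {
    int sum = sad_body(left, right, len);
    return fault_detected(ctx) ? INT_MAX : sum;
}

/* Illustrative probe: reports a fault for the first *ctx calls, then none. */
bool fail_n_times(void *ctx) {
    int *remaining = (int *)ctx;
    if (*remaining > 0) {
        --*remaining;
        return true;
    }
    return false;
}
```

With `fail_n_times` primed to report two faults, `sad_retry` runs the body three times and then returns the correct sum, while `sad_discard` returns `INT_MAX` on the first fault, so the encoder would simply treat that candidate block as the worst match.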


Hardware

Microarchitecture
  1. Fine-grained hardware detection (e.g. Argus)
  2. Recovery PC register + control logic

SIMPLE MICROARCHITECTURE

Hardware Organization

  Homogeneous Relax: all cores with no hardware recovery support
  Statically Heterogeneous Relax: some cores with hardware recovery, some cores without
  Dynamically Heterogeneous Relax: hardware recovery adaptively disabled

  Legend: "Relaxed" cores have no hardware recovery; normal cores have hardware recovery

FLEXIBLE DESIGN


Evaluation

Is it useful?

How useful is it?

Is it Useful?

7 Applications
  Language support using LLVM
  One relax region per application (most dominant function)
  Retry and discard behavior

Application Name         Percent Execution Time Contribution of Function
BarnesHut (Lonestar)     >99.9%
bodytrack (PARSEC)       21.9%
canneal (PARSEC)         89.4%
ferret (PARSEC)          15.7%
kmeans (MineBench)       83.3%
raytrace (PARSEC)        49.4%
x264 (PARSEC)            49.2%

IT WORKS!


How Useful Is It?

Software recovery for timing speculation

Methodology

Instruction-level fault injection

Execution time model
  Statically Heterogeneous architecture

Energy model
  Energy-delay product (EDP)
  Analytical model for hardware efficiency
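
For readers unfamiliar with the metric, energy-delay product is energy multiplied by execution time, and a normalized EDP divides that by the EDP of a baseline. The sketch below only illustrates this standard definition; the function names are hypothetical, and it is not the analytical hardware-efficiency model used in the talk.

```c
#include <assert.h>

/* Energy-delay product: energy multiplied by execution time. */
double edp(double energy, double delay) {
    return energy * delay;
}

/* EDP relative to a baseline configuration. Values below 1.0 mean the
   evaluated configuration is more efficient than the baseline. */
double normalized_edp(double energy, double delay,
                      double base_energy, double base_delay) {
    assert(base_energy > 0.0 && base_delay > 0.0);
    return edp(energy, delay) / edp(base_energy, base_delay);
}
```

As an illustration with made-up numbers, a configuration that uses 20% less energy but runs 5% longer than the baseline has a normalized EDP of 0.8 × 1.05 = 0.84, so a small slowdown can still be a net win.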

Results – Execution Time

[Figure: bar chart of execution time (y-axis 0 to 1.2) under retry and discard for barneshut, bodytrack, canneal, ferret, kmeans, raytrace, and x264]

*Error rates range from 10^-3 to 10^-6 errors/cycle

Execution time overhead is less than 10%, and typically 1%

Discard performance is comparable to retry

Results – Energy-delay

[Figure: bar chart of normalized EDP (y-axis -0.2 to 1.2) under retry and discard for barneshut, bodytrack, canneal, ferret, kmeans, raytrace, and x264]

*Error rates range from 10^-3 to 10^-6 errors/cycle

Relax achieves energy improvements for timing speculation

Future Work

Better software support
  Compiler automation?
  Binary instrumentation?
  Nesting relax blocks?

Hardware support
  What are the chip-level area and power savings?
  Is Relax hardware truly simpler?

Other domains
  Software rollback for hardware transactional memory?

Tools to assist analysis of "discard"
  Discard is hard to reason about; non-deterministic

Summary

Emerging Architectures
  Many-core architectures are simple
  Hardware fault recovery is complex

Emerging Applications
  Error tolerant
  Large idempotent regions

Software Recovery is a natural fit
  Relax: an architectural framework for software recovery
    ISA: an interface to define it
    Software: support for applications to use it
    Hardware: hardware that enables it

?

ISA Semantics

Errors must be "spatially contained" to the target resources of a relax block
  Misdirected stores and register writes are not recoverable by Relax!

Errors must be "temporally contained" to the scope of a relax block
  ECC (or other technique) necessary for memory
  Cache coherence, cache writeback, etc. require other mechanisms

Control flow must be "legal" (follow static control flow edges)
  Includes hardware exceptions (must wait on detection before trap)

Atomic operations (e.g. atomic increment) are problematic
  Not supported (sorry)

Fault Detection

Short latencies important for
  Detecting misdirected stores
  Detecting misdirected register writes

Otherwise, acceptable latencies depend on region sizes
  50 cycle regions + 5 cycle latency = 10% overhead
  Average region sizes in paper = 1000 cycles
    Then, 10 cycle latency = 1% overhead
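
The arithmetic on this slide can be written as a one-line helper. This is a sketch that assumes, as the slide's numbers imply, that the detection latency is paid once per relax region, so the overhead is latency divided by region length; the function name is hypothetical.

```c
#include <assert.h>

/* Fraction of extra execution time spent waiting for fault detection,
   assuming the wait equals the detection latency and occurs once per
   relax region. */
double detection_overhead(double region_cycles, double latency_cycles) {
    assert(region_cycles > 0.0);
    return latency_cycles / region_cycles;
}
```

Under this assumption, `detection_overhead(50, 5)` gives 0.10 and `detection_overhead(1000, 10)` gives 0.01, matching the 10% and 1% figures above.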

"Optimal" Error Rate

[Figure: three plots against error rate: Hardware Efficiency (y-axis EDP), Execution Time (y-axis Time), and Overall Efficiency (y-axis EDP, with the optimum marked)]