De-optimizations ATTACK!!!

Finding the Limits of Hardware Optimization through Software De-optimization

Presented By: Derek Kern, Roqyah Alalqam, Ahmed Mehzer, Mohammed Mohammed

Post on 23-Feb-2016

Outline
- Flashback
  - Project Structure
  - Judging de-optimizations
  - What does a de-op look like?
- General Areas of Focus
  - Instruction Fetching and Decoding
  - Instruction Scheduling
  - Instruction Type Usage (e.g. Integer vs. FP)
  - Branch Prediction
  - Idiosyncrasies

Outline
- Our Methods
  - Measuring clock cycles
  - Eliminating noise
- Something about the de-ops that didn't work
- Lots and lots of de-ops

Flashback

During the research project:
- We studied de-optimizations
- We studied the Opteron

For the implementation project:
- We have chosen de-optimizations to implement
- We have chosen algorithms that may best reflect our de-optimizations
- We have implemented the de-optimizations
- And, we're here to report the results

Flashback

Judging de-optimizations (de-ops)
- Whether the de-op affects scheduling, caching, branching, etc., its impact will be felt in the clocks needed to execute an algorithm. So, our metric of choice will be CPU clock cycles.

What does a de-op look like?
- A de-op is a change to an optimal implementation of an algorithm that increases the clock cycles needed to execute the algorithm and that demonstrates some interesting fact about the CPU in question.
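As a rough illustration of judging a de-op by this metric, the sketch below times a baseline routine and a modified one and reports the smaller tick count seen over several runs. Everything in it is an assumption made for illustration, not code from the project: the names `read_ticks`, `sum_baseline`, `sum_through_memory`, and `min_ticks` are hypothetical, the "through memory" change is just one plausible candidate, and taking the minimum over runs is only one simple way to damp noise. On x86 the counter comes from the compiler's `__rdtsc()` intrinsic; elsewhere the sketch falls back to nanoseconds, which are not clock cycles.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>
/* On x86, read the time-stamp counter through the GCC/Clang intrinsic. */
static uint64_t read_ticks(void) { return __rdtsc(); }
#else
/* Assumption: on other CPUs, fall back to monotonic-clock nanoseconds. */
static uint64_t read_ticks(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif

typedef uint64_t (*kernel_fn)(const uint32_t *data, size_t n);

/* Baseline: accumulate into a local the compiler may keep in a register. */
static uint64_t sum_baseline(const uint32_t *data, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    return sum;
}

/* Hypothetical candidate de-op: `volatile` forces the running total to be
 * loaded from and stored to memory on every iteration. */
static uint64_t sum_through_memory(const uint32_t *data, size_t n)
{
    volatile uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum = sum + data[i];
    return sum;
}

/* Run `fn` `runs` times and return the smallest tick count observed.
 * The kernel's return value is written to *result so both versions can be
 * checked for identical output. */
static uint64_t min_ticks(kernel_fn fn, const uint32_t *data, size_t n,
                          int runs, uint64_t *result)
{
    assert(runs > 0);
    uint64_t best = UINT64_MAX;
    for (int r = 0; r < runs; r++) {
        uint64_t start = read_ticks();
        *result = fn(data, n);
        uint64_t elapsed = read_ticks() - start;
        if (elapsed < best)
            best = elapsed;
    }
    return best;
}
```

A caller would fill an array, call `min_ticks` once per kernel, confirm both results match, and then compare the two tick counts. Whether the modified version actually costs more depends on the compiler flags and the CPU, which is exactly what a measurement like this is meant to reveal.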

The CPUs
- AMD Opteron (Hydra)
- Intel Nehalem (Derek's Laptop)

Our primary focus was the Opteron

The de-optimizations were designed to affect the Opteron

We also tested them on the Intel in order to give you an idea of how universal a de-optimization is

When we know why something does or doesn't affect the Intel, we will try to let you know

Our Methods

The code

Most of the de-optimizations are written in C (GCC)

Some of them have a wrapper that is written in C, while the code being de-optimized is written in NASM (assembly)

E.g. Mod_ten_counter, Factorial_over_array

Typically, if a de-op is written in NASM, then the C wrapper does all of the grunt work prior to calling the de-optimized NASM module

Our Methods

Problem: How do we measure clock cycles?

An obvious answer: CodeAnalyst

Actually, we were getting strange results from CodeAnalyst

And, it is hard to separate important code sections from unimportant code sections

And, it is cumbersome to work with

Our Methods

A better answer: Embed code that measures clock cycles for important sections

OK... but how?

Our Methods

    #if defined(__i386__)
    static __inline__ unsigned long long rdtsc(void)
    {
        unsigned long long int x;
        __asm__ volatile (".byte 0x0f, 0x31" : "=A" (x));
        return x;
    }

    #elif defined(__x86_64__)
    static __inline__ unsigned long long rdtsc(void)
    {
        unsigned hi, lo;
        __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
        return ( (unsigned long long)lo)|( ((unsigned long long)hi)