Post on 09-Oct-2020

Parallel Programming with Synthesis 2.0

Shaon Barman, Rastislav Bodik, Sagar Jain, Yewen Pu, Saurabh Srivastava, Nicholas Tung

ParLab

spec:
int foo (int x) {
    return x + x;
}

sketch:
int bar (int x) implements foo {
    return x << ??;
}

result:
int bar (int x) implements foo {
    return x << 1;
}

[Figure: the SKETCH synthesizer takes the spec (x + x) and the sketch (x << ??) and produces the result (x << 1).]
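To make the hole-filling idea concrete, here is a minimal Python sketch that emulates it by brute force. It is not how SKETCH is implemented (SKETCH uses a solver-based search); the function names, the hole range, and the input bound are assumptions chosen for illustration.

```python
def spec(x: int) -> int:
    """Reference behaviour: the spec from the example above."""
    return x + x


def sketch(x: int, hole: int) -> int:
    """The sketch 'x << ??' with the hole passed in as a parameter."""
    return x << hole


def synthesize(max_hole: int = 31, inputs=range(256)):
    """Try every hole value; keep the first that matches the spec
    on all bounded inputs (assumed bound: 0..255)."""
    for hole in range(max_hole + 1):
        if all(sketch(x, hole) == spec(x) for x in inputs):
            return hole
    return None  # no hole value satisfies the spec on the bounded inputs


print(synthesize())  # prints 1
```

The search stops at the first hole value that agrees with the spec on every tested input, which mirrors the bounded check described later under the limitations.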

int spec (int x) {
    return x*x*x - 19*x + 30;
}

#define Root {| ?? | -?? |}
int sketch (int x) implements spec {
    return (x - Root) * (x - Root) * (x - Root);
}

int sketch (int x) implements spec {
    return (x+5)*(x-3)*(x-2);
}
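The factoring example can be reproduced with a small enumeration in Python. This is an illustrative sketch under assumptions: each Root is taken to be a small integer (the range -7..7 is hypothetical), and equivalence is checked only on a bounded set of inputs.

```python
from itertools import product


def spec(x: int) -> int:
    return x * x * x - 19 * x + 30


def candidate(x: int, roots) -> int:
    """The sketch (x - Root) * (x - Root) * (x - Root) with three chosen roots."""
    r1, r2, r3 = roots
    return (x - r1) * (x - r2) * (x - r3)


def synthesize_roots(max_abs: int = 7, inputs=range(-10, 11)):
    """Enumerate root triples; return the first one matching the spec
    on all bounded inputs, sorted for a stable result."""
    choices = range(-max_abs, max_abs + 1)
    for roots in product(choices, repeat=3):
        if all(candidate(x, roots) == spec(x) for x in inputs):
            return tuple(sorted(roots))
    return None


print(synthesize_roots())  # prints (-5, 2, 3)
```

The roots -5, 3 and 2 correspond to the factors (x+5), (x-3) and (x-2) in the synthesized result.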


int[16] trans(int[16] M) {
    int[16] T = 0;
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            T[4 * i + j] = M[4 * j + i];
    return T;
}

int[16] trans_sse(int[16] M) implements trans {
    int[16] S = 0, T = 0;
    repeat (??) S[??::4] = shufps(M[??::4], M[??::4], ??);
    repeat (??) T[??::4] = shufps(S[??::4], S[??::4], ??);
    return T;
}

int[16] trans_sse(int[16] M) implements trans {
    int[16] S = 0, T = 0;
    S[4::4]  = shufps(M[6::4],  M[2::4],  11001000b);
    S[0::4]  = shufps(M[11::4], M[6::4],  10010110b);
    S[12::4] = shufps(M[0::4],  M[2::4],  10001101b);
    S[8::4]  = shufps(M[8::4],  M[12::4], 11010111b);
    T[4::4]  = shufps(S[11::4], S[1::4],  10111100b);
    T[12::4] = shufps(S[3::4],  S[8::4],  11000011b);
    T[8::4]  = shufps(S[4::4],  S[9::4],  11100010b);
    T[0::4]  = shufps(S[12::4], S[0::4],  10110100b);
    return T;
}
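The two-phase structure of the sketch (four shuffles into S, then four shuffles into T) can be illustrated in Python. The block below assumes the standard x86 SHUFPS selection rule, which may differ from the shufps model used in the SKETCH example, and it uses the well-known aligned two-phase transpose rather than the synthesized constants above.

```python
def shufps(a, b, imm):
    """Assumed x86 SHUFPS semantics: two lanes picked from a, two from b,
    each selected by a 2-bit field of imm (lowest field first)."""
    return [a[imm & 3], a[(imm >> 2) & 3], b[(imm >> 4) & 3], b[(imm >> 6) & 3]]


def trans(M):
    """Reference transpose of a row-major 4x4 matrix (the spec)."""
    T = [0] * 16
    for i in range(4):
        for j in range(4):
            T[4 * i + j] = M[4 * j + i]
    return T


def trans_shuffle(M):
    """Transpose using eight shuffles in two phases."""
    r0, r1, r2, r3 = (M[4 * k:4 * k + 4] for k in range(4))
    # Phase 1: interleave pairs of rows.
    t0 = shufps(r0, r1, 0b01000100)  # r0[0], r0[1], r1[0], r1[1]
    t2 = shufps(r0, r1, 0b11101110)  # r0[2], r0[3], r1[2], r1[3]
    t1 = shufps(r2, r3, 0b01000100)  # r2[0], r2[1], r3[0], r3[1]
    t3 = shufps(r2, r3, 0b11101110)  # r2[2], r2[3], r3[2], r3[3]
    # Phase 2: gather one column of the input into each output row.
    return (shufps(t0, t1, 0b10001000) + shufps(t0, t1, 0b11011101)
            + shufps(t2, t3, 0b10001000) + shufps(t2, t3, 0b11011101))


M = list(range(16))
print(trans_shuffle(M) == trans(M))  # prints True
```

Checking the shuffle version against `trans` on a matrix with 16 distinct entries plays the same role as the `implements trans` clause in the sketch.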


Examples above, in order: a simple optimization; factor a polynomial; SIMD transpose of a 4x4 matrix.

We are now working on exploiting synthesis throughout the HPC workflow, from design of parallel algorithms to kernel tuning. Ask us about details or come to our talk.

Domain expert:

problem spec, O(2^n) → dynamic programming, O(n) → parallel scan, O(log n)

Parallel algorithm expert:

example of parallel scan network → SIMD algorithm
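As an illustration of what a parallel scan network computes, the following Python sketch simulates a Hillis-Steele style inclusive prefix sum. The choice of this particular network and of addition as the operator are assumptions; the poster does not say which network is used. Each loop iteration stands for one parallel step in which all positions update at once.

```python
def sequential_scan(xs):
    """Inclusive prefix sum computed left to right, n steps."""
    out, acc = [], 0
    for x in xs:
        acc += x
        out.append(acc)
    return out


def scan_network(xs):
    """Simulated parallel scan: in each step, position i adds the value
    that sits `offset` places to its left. Returns (result, step count)."""
    ys = list(xs)
    n = len(ys)
    offset, steps = 1, 0
    while offset < n:
        # Build a new list so every position reads the previous step's values.
        ys = [ys[i] + ys[i - offset] if i >= offset else ys[i] for i in range(n)]
        offset *= 2
        steps += 1
    return ys, steps


data = [3, 1, 4, 1, 5, 9, 2, 6]
print(scan_network(data))  # prints ([3, 4, 8, 9, 14, 23, 25, 31], 3)
```

For 8 inputs the network finishes in 3 steps, which is the logarithmic depth that makes the scan attractive on SIMD hardware.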

GPU tuning expert:

SIMD algorithm → bank conflicts → index expressions
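The link between bank conflicts and index expressions can be shown with a small Python model. It assumes a shared memory with 32 word-wide banks, where word address i lives in bank i mod 32; the padding expression `i + i // 32` is a common hand-written fix and stands in for the kind of index expression a synthesizer would be asked to find. All names here are hypothetical.

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 banks, one word per bank per cycle


def conflict_degree(indices, num_banks=NUM_BANKS):
    """Largest number of distinct addresses that fall into the same bank.
    1 means conflict-free; k means the access is serialized k ways."""
    banks = Counter(i % num_banks for i in set(indices))
    return max(banks.values())


def padded(i, num_banks=NUM_BANKS):
    """Index expression that skips one word after every num_banks words."""
    return i + i // num_banks


# 32 threads, each reading the first element of its own 32-word row.
naive = [t * 32 for t in range(32)]
fixed = [padded(i) for i in naive]

print(conflict_degree(naive))  # prints 32
print(conflict_degree(fixed))  # prints 1
```

With the naive indices every thread hits bank 0; the padded indices spread the same accesses over all 32 banks.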

What are the limitations behind the magic?

SKETCH doesn’t produce a proof of correctness:

SKETCH checks the synthesized program against the spec on all inputs up to a certain size. The program could still be incorrect on larger inputs; checking it beyond that bound is up to the programmer.
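A tiny Python example shows why a bounded check is not a proof. The spec, the candidate, and the bounds are made up for illustration: the candidate agrees with the spec on every input up to 3 and fails at 4.

```python
def spec(x: int) -> int:
    return x % 4


def candidate(x: int) -> int:
    # Deliberately wrong in general: agrees with spec only for x in 0..3.
    return x


def bounded_check(f, g, bound: int) -> bool:
    """True if f and g agree on every input in 0..bound."""
    return all(f(x) == g(x) for x in range(bound + 1))


print(bounded_check(spec, candidate, 3))  # prints True
print(bounded_check(spec, candidate, 4))  # prints False
```

A checker that stops at bound 3 would accept the candidate, so the bound has to be chosen with the intended input sizes in mind.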

Scalability:

Some programs are too hard to synthesize in one step. We therefore propose refinement, which provides modularity and breaks the synthesis task into smaller problems.

Proposed solution: Gradually refine the program

In each programming step, optimize just one function. This reduces the scope of synthesis, improving scalability and simplifying verification.

How a classical code generator is developed:

Step 1: The codegen developer writes a skeleton of what the generated code should look like.

Step 2: The developer writes the code generator, which produces the desired code variant by parameterizing the skeleton.

How a synthesizing code generator is developed:

Step 1: The developer writes only the template. No actual generator is developed. The synthesizer completes the template to make the code variants behave like the spec.

Try SKETCH online at bit.ly/sketch-language
