automatic synthesis of high performance linear algebra ...
TRANSCRIPT
[Figure: panels labeled vertical folding and horizontal folding]
Parallel permutation
Example: 8 points
8 words in parallel

Streaming permutation
Example: 8 points, 2 points per cycle
2 words per cycle, 4 cycles
[Figure: 8-point permutation example. Index columns 0 1 2 3 4 5 6 7 and 0 7 1 6 2 5 3 4, with ticks 0 1 2 3; panels labeled vertical folding and horizontal folding]
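The difference between the parallel and the streaming form of the 8-point example can be sketched in a few lines of Python. This is only an illustrative model, not the method from the slides: it assumes that the sequence 0 7 1 6 2 5 3 4 from the figure is the output order (output slot i carries input word ORDER[i]), and it uses a simple store-everything-then-read buffer rather than a minimal-memory streaming datapath. The names ORDER, permute_parallel and permute_streaming are hypothetical.

```python
from typing import List, Sequence

# Assumption: the figure's second index column is the output order,
# i.e. output slot i carries input word ORDER[i].
ORDER = [0, 7, 1, 6, 2, 5, 3, 4]


def permute_parallel(words: Sequence, order: Sequence[int] = ORDER) -> List:
    """All 8 words are available at once; reorder them in a single step."""
    return [words[src] for src in order]


def permute_streaming(words: Sequence, width: int = 2,
                      order: Sequence[int] = ORDER) -> List[List]:
    """Model a datapath that accepts and emits `width` words per cycle.

    Simplification: every input chunk is buffered before the first output
    chunk is read. A real streaming design would overlap both phases.
    """
    n = len(words)
    if n % width != 0:
        raise ValueError("number of words must be a multiple of the width")
    cycles = n // width  # 8 points at 2 words per cycle -> 4 cycles

    # Write phase: one chunk of `width` words arrives per cycle.
    buffer = {}
    for cycle in range(cycles):
        for lane in range(width):
            idx = cycle * width + lane
            buffer[idx] = words[idx]

    # Read phase: one chunk of `width` permuted words leaves per cycle.
    out_chunks = []
    for cycle in range(cycles):
        chunk = [buffer[order[cycle * width + lane]] for lane in range(width)]
        out_chunks.append(chunk)
    return out_chunks


if __name__ == "__main__":
    data = list("abcdefgh")
    print(permute_parallel(data))   # one step, 8 words wide
    print(permute_streaming(data))  # 4 cycles, 2 words per cycle
```

Flattening the four streamed chunks gives the same sequence as the parallel version; only the width of the datapath and the number of cycles differ.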
[Figure: samples (evaluated) and confidence regions; annotations ⊂, ε-, η ≤ T]