![Page 1: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/1.jpg)
Challenging Cloning Related Problems with GPU-Based Algorithms
Authors: Thierry Lavoie, Michael Eilers-Smith, Ettore Merlo
Publisher: ACM IWSC’10
Presenter: Ye-Zhi Chen
Date: 2011/12/21
![Page 2: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/2.jpg)
• This paper describes an implementation of the Smith-Waterman algorithm for proper clone filtering
Introduction
![Page 3: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/3.jpg)
• To address the false-positive problem in clone detection with an appropriate filtering technique, DP-matching seemed to be an interesting choice
Algorithm
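
|   | - | A | B | C | C | A |
|---|---|---|---|---|---|---|
| - | 0X | 0X | 0X | 0X | 0X | 0X |
| A | 0X | 1↖ | 1← | 1← | 1← | 1↖ |
| B | 0X | 1↑ | 2↖ | 2← | 2← | 2← |
| A | 0X | 1↖ | 2↑ | 2↑ | 2↑ | 3↖ |
| C | 0X | 1↑ | 2↑ | 3↖ | 3↖ | 3↑ |
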
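
The matrix above can be reproduced with a short sequential sketch. The scoring rule (+1 on a match, no penalties) and the tie-breaking order (top preferred over left) are inferred from the table values, not stated on the slides, and the function name `dp_matching` is hypothetical.

```python
def dp_matching(s1, s2):
    """Fill the matching matrix row by row (rows follow s1, columns follow s2)."""
    rows, cols = len(s1) + 1, len(s2) + 1
    # Row 0 and column 0 stand for the leading gap: score 0, no arrow ("X").
    score = [[0] * cols for _ in range(rows)]
    arrow = [["X"] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if s1[i - 1] == s2[j - 1]:
                # Match: extend the upper-left neighbour (assumed +1 score).
                score[i][j] = score[i - 1][j - 1] + 1
                arrow[i][j] = "↖"
            elif score[i - 1][j] >= score[i][j - 1]:
                # Mismatch: copy the larger neighbour, top wins ties (assumed).
                score[i][j] = score[i - 1][j]
                arrow[i][j] = "↑"
            else:
                score[i][j] = score[i][j - 1]
                arrow[i][j] = "←"
    return score, arrow


score, arrow = dp_matching("ABAC", "ABCCA")
for label, values, arrows in zip("-ABAC", score, arrow):
    print(label, " ".join(f"{v}{a}" for v, a in zip(values, arrows)))
```

Each cell reads only its top, left, and upper-left neighbours, which is the property the GPU version relies on.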
![Page 4: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/4.jpg)
Algorithm
![Page 5: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/5.jpg)
GPU DP-matching:
• Find which cells of the matrix are free of computational dependencies, so that their values can be computed on separate cores simultaneously.
• It is easy to check that all cells on an anti-diagonal become free of computational dependencies at the same moment, because their values depend solely on cells of the previous anti-diagonals.
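
The anti-diagonal ordering can be sketched on the CPU as follows. This is only an illustration of the traversal order, not the authors' GPU kernel: the inner loop over `cells` is what a GPU would distribute over threads, and the function name `wavefront_dp` and the +1 match scoring are assumptions.

```python
def wavefront_dp(s1, s2):
    """Fill the matrix one anti-diagonal (i + j = k) at a time."""
    rows, cols = len(s1) + 1, len(s2) + 1
    d = [[0] * cols for _ in range(rows)]  # row 0 and column 0 are the gap border
    steps = []
    for k in range(2, rows + cols - 1):
        # Interior cells (i >= 1, j >= 1) whose indexes sum to k.
        cells = [(i, k - i)
                 for i in range(max(1, k - cols + 1), min(rows - 1, k - 1) + 1)]
        # Every cell below reads only anti-diagonals k-1 and k-2, so the
        # iterations are independent and could each run on a separate thread.
        for i, j in cells:
            if s1[i - 1] == s2[j - 1]:
                d[i][j] = d[i - 1][j - 1] + 1
            else:
                d[i][j] = max(d[i - 1][j], d[i][j - 1])
        steps.append(cells)
    return d, steps


d, steps = wavefront_dp("ABAC", "ABCCA")
print(d[-1][-1], len(steps), max(len(cells) for cells in steps))
```

For strings of lengths n and m the loop runs n + m − 1 steps, and the widest step holds min(n, m) independent cells.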
Algorithm
![Page 6: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/6.jpg)
• Let Vk represent the linear buffer computed at step k. Let fk be the following map between the indexes of V and those of the matrix D:
u can be seen as the thread index; the first character of both s1 and s2 is a gap
Algorithm
![Page 7: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/7.jpg)
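Algorithm

|   | - | A | B | C | C | A |
|---|---|---|---|---|---|---|
| - | 0X | 0X | 0X | 0X | 0X | 0X |
| A | 0X | 1↖ | 1← | 1← | 1← | 1↖ |
| B | 0X | 1↑ | 2↖ | 2← | 2← | 2← |
| A | 0X | 1↖ | 2↑ | 2↑ | 2↑ | 3↖ |
| C | 0X | 1↑ | 2↑ | 3↖ | 3↖ | 3↑ |
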
![Page 8: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/8.jpg)
Figure labels: the characters which are compared; the top, left, and upper-left neighbour cells
![Page 9: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/9.jpg)
Worst case problem:
• The classical DP-matching algorithm has a quadratic worst-case running time.
• In the general worst case, the GPU-based implementation also has a quadratic running time.
• However, since a large number of cores perform the computation at the same time, the hidden constant of the quadratic term can be divided by a large factor.
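
A rough step count makes the effect of the core count concrete. The model below is an idealized assumption (one cell per core per step, no memory or launch overhead), and `parallel_steps` is a hypothetical helper, not something taken from the paper.

```python
def parallel_steps(n, m, cores):
    """Idealized number of steps to fill an n x m matrix by anti-diagonals."""
    total = 0
    for k in range(n + m - 1):
        # Number of cells on the k-th anti-diagonal of an n x m matrix.
        length = min(k + 1, n, m, n + m - 1 - k)
        # Ceiling division: each step processes at most `cores` cells.
        total += -(-length // cores)
    return total


for cores in (1, 2, 4):
    print(cores, parallel_steps(4, 5, cores))
```

With a single core the count equals the n * m cells of the sequential algorithm; once the core count reaches min(n, m) it drops to the n + m − 1 anti-diagonals. With a fixed number of cores the growth stays quadratic, only with a smaller constant.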
Algorithm
![Page 10: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/10.jpg)
• On very small instances of DP-matching problems, the CPU might outrun the GPU, mostly because of memory bandwidth limitations
• If such very small instances are computed by matching one string against a set of strings, there is a way of packing the data on the GPU to make the total computation more efficient.
Algorithm
![Page 11: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/11.jpg)
• Let C be a set of strings and let c0 be an element of C. Let us define C’ as: C’ = C − {c0}
The problem is then defined as matching c0 against all ci in C’.
• Practical implementations need to pad the strings to be matched. This forces the number of computational steps k to be the same in each sub-matrix. The length of the padding p of a ci is defined as follows: p = max(len(cj) | cj ∈ C) − len(ci)
• The padded ci of C’ are then concatenated, separated by a special blank character
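
The padding and concatenation can be sketched as below. The padding character `#`, the separator `_`, padding on the right-hand side, and the function name `pack_strings` are all assumptions made for illustration; the slides only say that a special blank character separates the padded strings.

```python
def pack_strings(c0, others, pad="#", sep="_"):
    """Pad every string of C' to a common width and join them into one buffer."""
    # The width is the longest string in C, that is c0 together with C'.
    width = max(len(c) for c in [c0, *others])
    # Right-pad each ci with p = width - len(ci) padding characters (assumed side).
    padded = [c + pad * (width - len(c)) for c in others]
    return sep.join(padded), width


packed, width = pack_strings("ABAC", ["AB", "ABCCA", "A"])
print(packed, width)
```

Every padded string then occupies the same number of columns, so all sub-matrices against c0 need the same number of anti-diagonal steps.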
Algorithm
![Page 12: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/12.jpg)
The initial value of k is not 0; it is (|C’| − 1) * (max(len(ci) | ci ∈ C) + 1)
The number of computational steps k is reduced to 2 * max(len(ci) | ci ∈ C) − 1
![Page 13: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/13.jpg)
![Page 14: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/14.jpg)
The indexes γ corresponding to these cells can be evaluated with this equation: γ = x * (max(len(ci) | ci ∈ C) + 1), ∀ x ∈ {0..|C| − 1}
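
The γ formula and the step count 2 * max(len(ci)) − 1 are easy to evaluate for a small example. The helper names below are hypothetical, and the list passed in simply plays the role of the string set used in the formulas; which cells sit at the γ offsets depends on the slide figures, which are not reproduced here.

```python
def gamma_indexes(strings):
    """Offsets x * (max_len + 1) for x in 0 .. len(strings) - 1."""
    width = max(len(c) for c in strings)
    # Each padded string plus one separator occupies width + 1 columns.
    return [x * (width + 1) for x in range(len(strings))]


def steps_per_block(strings):
    """Number of computational steps quoted on the slides: 2 * max_len - 1."""
    return 2 * max(len(c) for c in strings) - 1


strings = ["AB", "ABCCA", "A"]
print(gamma_indexes(strings), steps_per_block(strings))
```

Under the assumption of one separator after each padded string, these offsets are spaced exactly one padded block apart.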
![Page 15: Challenging Cloning Related Problems with GPU-Based Algorithms](https://reader035.vdocuments.net/reader035/viewer/2022081505/56816699550346895dda8707/html5/thumbnails/15.jpg)
Equipment: Intel Core 2 Duo computer at 3.00 GHz with 6 MB of cache, 3 GB of RAM, and a GeForce 8800GT
EXPERIMENTAL