Genetic Programming on General Purpose Graphics Processing Units
(GPGPGPU)
Muhammad Iqbal
Evolutionary Computation Research Group
School of Engineering and Computer Sciences
Overview
• Graphics Processing Units (GPUs) are no longer limited to graphics:
• High degree of programmability
• Fast floating point operations
• GPUs are now GPGPUs
• Genetic programming is computationally intensive, which makes it a prime candidate for GPUs.
Outline
• Genetic Programming
• Genetic Programming Resource Demands
• GPU Programming
• Genetic Programming on GPU
• Automatically Defined Functions
Genetic Programming (GP)
• Evolutionary algorithm-based methodology
• To optimize a population of computer programs
• Tree based representation
• Example:
X Output
0 1
1 3
2 7
3 13
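The tree figure for this example did not survive in the transcript. As a minimal sketch, assuming the target program is x*x + x + 1 (one expression consistent with the four rows above) and a hypothetical nested-tuple tree encoding, a candidate tree could be evaluated and scored against the table like this:

```python
import operator

# Hypothetical tree encoding: a leaf is "x" or a number,
# an internal node is (function_name, left_subtree, right_subtree).
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

# Fitness cases copied from the table above: (X, Output).
FITNESS_CASES = [(0, 1), (1, 3), (2, 7), (3, 13)]

def evaluate(tree, x):
    """Recursively evaluate a tree for one input value x."""
    if tree == "x":
        return x
    if isinstance(tree, (int, float)):
        return tree
    name, left, right = tree
    return FUNCTIONS[name](evaluate(left, x), evaluate(right, x))

def error(tree):
    """Sum of absolute errors over all fitness cases (0 is a perfect fit)."""
    return sum(abs(evaluate(tree, x) - y) for x, y in FITNESS_CASES)

# Assumed candidate: (x * x) + (x + 1)
candidate = ("+", ("*", "x", "x"), ("+", "x", 1))
print(error(candidate))        # 0
print(error(("+", "x", 1)))    # 14
```

An evolutionary run would repeatedly vary and select such trees, using an error measure of this kind as the fitness signal.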
GP Resource Demands
• GP is notoriously resource consuming
• CPU cycles
• Memory
• Standard GP system, 1 μs per node
• Binary trees, depth 17: 131 ms per tree
• Fitness cases: 1,000; population size: 1,000
• Generations: 1,000; number of runs: 100
» Runtime: 10 Gs ≈ 317 years
• Standard GP system, 1 ns per node
» Runtime: 116 days
• Limits to what we can approach with GP
[Banzhaf and Harding – GECCO 2009]
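The arithmetic behind these figures can be reproduced in a few lines. This is a sketch under two assumptions that are not stated on the slide: a depth-17 binary tree is taken to have about 2**17 nodes (which matches the quoted 131 ms), and the headline runtimes appear to use the per-tree time rounded down to 100 ms. The function name `total_runtime_seconds` is illustrative only.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
SECONDS_PER_DAY = 24 * 3600

def total_runtime_seconds(seconds_per_tree, fitness_cases=1_000,
                          population=1_000, generations=1_000, runs=100):
    """Every run evaluates every individual on every fitness case
    in every generation."""
    evaluations = fitness_cases * population * generations * runs
    return evaluations * seconds_per_tree

# Assumption: about 2**17 nodes per tree, at 1 microsecond per node.
nodes = 2 ** 17
print(nodes * 1e-6)                      # roughly 0.131 s per tree

# Assumption: per-tree time rounded to 0.1 s (1 us per node)
# and 0.1 ms (1 ns per node).
slow = total_runtime_seconds(0.1)
fast = total_runtime_seconds(0.1e-3)
print(slow / SECONDS_PER_YEAR)           # roughly 317 years
print(fast / SECONDS_PER_DAY)            # roughly 116 days
```

Using the unrounded 131 ms per tree gives a somewhat larger total, so the slide's numbers are best read as order-of-magnitude estimates.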
Sources of Speed-up
• Fast machines
• Vector Processors
• Parallel Machines (MIMD/SIMD)
• Clusters
• Loose Networks
• Multi-core
• Graphics Processing Units (GPU)
Why Is the GPU Faster than the CPU?
The GPU devotes more transistors to data processing.
[CUDA C Programming Guide Version 3.2]
GPU Programming APIs
• There are a number of toolkits available for programming GPUs.
• CUDA
• MS Accelerator
• RapidMind
• Shader programming
• So far, researchers in GP have not converged on one platform
CUDA Programming
Massive number (>10000) of light-weight threads.
CUDA Memory Model
CUDA exposes all the different types of memory on the GPU:
• Per thread: registers and local memory
• Per block: shared memory
• Per grid (device): global memory, constant memory, and texture memory
[Figure: the host alongside a device grid containing Block (0, 0) and Block (1, 0), each with its own shared memory and two threads, Thread (0, 0) and Thread (1, 0), each with its own registers and local memory.]
[CUDA C Programming Guide Version 3.2]
CUDA Programming Model
The GPU is viewed as a computing device operating as a coprocessor to the main CPU (host).
• Data-parallel, computationally intensive functions should be off-loaded to the device.
• Functions that are executed many times, but independently on different data, are prime candidates, e.g. the bodies of for-loops.
• A function compiled for the device is called a kernel.
Stop Thinking About What to Do and Start Doing It!
• Memory transfer is expensive.
• Computation is cheap.
• No longer calculate and store in memory
• Just recalculate
• Built-in variables
• threadIdx
• blockIdx
• gridDim
• blockDim
Example: Increment Array Elements
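The code listing for this slide is not part of the transcript. The following is a plain-Python emulation, not CUDA and not the presenter's code, of the common one-thread-per-element pattern: each simulated thread derives a global index from its block index, the block size, and its thread index, mirroring the roles of `blockIdx`, `blockDim`, and `threadIdx`. The names `increment_kernel` and `launch` are hypothetical.

```python
import math

def increment_kernel(block_idx, block_dim, thread_idx, data, amount):
    """Body executed by one simulated thread: update at most one element."""
    i = block_idx * block_dim + thread_idx   # global index of this thread
    if i < len(data):                        # guard: the last block may overshoot
        data[i] += amount

def launch(kernel, grid_dim, block_dim, *args):
    """Run the kernel once per (block, thread) pair. A GPU would run these
    concurrently; this emulation simply loops over them."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)

values = [10, 20, 30, 40, 50, 60, 70]
block_dim = 4                                   # threads per block
grid_dim = math.ceil(len(values) / block_dim)   # enough blocks to cover the array
launch(increment_kernel, grid_dim, block_dim, values, 1)
print(values)   # [11, 21, 31, 41, 51, 61, 71]
```

The bounds check matters because the number of launched threads (8 here) is rounded up to a whole number of blocks and can exceed the array length (7).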
Example: Matrix Addition
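The original listing is likewise missing here. As a hedged illustration only, this plain-Python emulation (again not CUDA) shows the two-dimensional version of the indexing idea: each simulated thread computes one row and column from its block and thread coordinates and adds a single pair of elements. `matrix_add_kernel` and `launch_2d` are hypothetical names.

```python
def matrix_add_kernel(block_idx, block_dim, thread_idx, a, b, c):
    """One simulated thread computes one element c[row][col]."""
    row = block_idx[1] * block_dim[1] + thread_idx[1]
    col = block_idx[0] * block_dim[0] + thread_idx[0]
    if row < len(a) and col < len(a[0]):     # skip threads outside the matrix
        c[row][col] = a[row][col] + b[row][col]

def launch_2d(kernel, grid_dim, block_dim, *args):
    """Sequentially visit every (block, thread) pair of a 2D launch."""
    for by in range(grid_dim[1]):
        for bx in range(grid_dim[0]):
            for ty in range(block_dim[1]):
                for tx in range(block_dim[0]):
                    kernel((bx, by), block_dim, (tx, ty), *args)

a = [[1, 2, 3], [4, 5, 6]]
b = [[10, 20, 30], [40, 50, 60]]
c = [[0] * 3 for _ in range(2)]
block_dim = (2, 2)    # (x, y) threads per block
grid_dim = (2, 1)     # ceil(3 / 2) blocks in x, ceil(2 / 2) blocks in y
launch_2d(matrix_add_kernel, grid_dim, block_dim, a, b, c)
print(c)   # [[11, 22, 33], [44, 55, 66]]
```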
Parallel Genetic Programming
While most GP work is conducted on sequential computers, the following computationally intensive features make it well suited to parallel hardware:
• Individuals are run on multiple independent training examples.
• The fitness of each individual could be calculated on independent hardware in parallel.
• Multiple independent runs of the GP are needed for statistical confidence, given the stochastic element of the result.
[Langdon and Banzhaf, EuroGP-2008]
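The independence described in the first two points can be sketched as follows. The population, the training set, and the use of a thread pool are all assumptions made for illustration; in CPython a thread pool will not speed up pure-Python code, so this shows only that the fitness calls share no state, not a realistic speed-up.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical training set: (input, expected output) pairs for x*x + x + 1.
TRAINING = [(x, x * x + x + 1) for x in range(100)]

# Hypothetical population: each individual is just a Python callable here.
POPULATION = [
    lambda x: x * x + x + 1,
    lambda x: x * x,
    lambda x: 2 * x + 1,
]

def fitness(individual):
    """Count the training examples the individual reproduces exactly.
    Each example is scored without reference to any other example."""
    return sum(1 for x, y in TRAINING if individual(x) == y)

# Each fitness call touches only its own individual, so the calls can be
# dispatched to separate workers in any order.
with ThreadPoolExecutor(max_workers=3) as pool:
    scores = list(pool.map(fitness, POPULATION))

print(scores)   # [100, 0, 2]
```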
A Many Threaded CUDA Interpreter for Genetic Programming
• Running Tree GP on GPU
• 8692 times faster than a PC without a GPU
• Solved the 20-bit Multiplexor
• $2^{20} = 1048576$ fitness cases
• Has never been solved by tree GP before
• Previously estimated time: more than 4 years
• GPU has consistently done it in less than an hour
• Solved the 37-bit Multiplexor
• $2^{37} = 137438953472$ fitness cases
• Has never been attempted before
• GPU solves it in under a day
[W. B. Langdon, EuroGP-2010]
Boolean Multiplexor
• $a$ address bits select one of $d$ data bits
• $d = 2^a$
• $n = a + d$
• Number of test cases $= 2^n$
• 20-mux: 1 million test cases
• 37-mux: 137 billion test cases
[W. B. Langdon, EuroGP-2010]
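A short sketch makes the sizes concrete and shows what a reference multiplexer computes. The bit layout (address bits first, read most-significant bit first) is an assumption, since the slides do not state it, and the function names are illustrative.

```python
def num_inputs(a):
    """Total inputs n = a + d for a address bits and d = 2**a data bits."""
    return a + 2 ** a

def multiplexer(bits, a):
    """Reference multiplexer: the first a bits form an address that
    selects one of the remaining 2**a data bits.
    Assumption: the address is read most-significant bit first."""
    address = 0
    for bit in bits[:a]:
        address = (address << 1) | bit
    return bits[a + address]

for a in (2, 4, 5):
    n = num_inputs(a)
    print(a, n, 2 ** n)
# 2 6 64
# 4 20 1048576
# 5 37 137438953472

# 6-mux example: address bits (1, 0) select data bit number 2.
print(multiplexer([1, 0, 0, 0, 1, 0], a=2))   # 1
```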
Genetic Programming Parameters for Solving 20 and 37 Multiplexors
Terminals: 20 or 37 Boolean inputs, D0 – D19 or D0 – D36 respectively
Functions: AND, OR, NAND, NOR
Fitness: Pseudo-random sample of 2048 of 1048576 (20-Mux) or 8192 of 137438953472 (37-Mux) fitness cases
Tournament: 4 members run on the same random sample; new samples for each tournament and each generation
Population: 262144
Initial population: Ramped half-and-half 4:5 (20-Mux) or 5:7 (37-Mux)
Parameters: 50% subtree crossover, 5% subtree and 45% point mutation; max depth 15, max size 511 (20-Mux) or 1023 (37-Mux)
Termination: 5000 generations
[W. B. Langdon, EuroGP-2010]
Solutions are found in generations 423 (20-Mux) and 2866 (37-Mux).
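The sampled-fitness tournament can be sketched on a small 6-mux. This is an illustration under stated assumptions, not Langdon's implementation: programs are plain Python callables rather than evolved trees, the sample is drawn with replacement, and the input bit layout is chosen for convenience.

```python
import random

A = 2                      # address bits; a 6-mux keeps the demo small
N = A + 2 ** A             # 6 inputs, so 2**6 = 64 possible fitness cases

def target(case):
    """Reference 6-mux on an integer whose N low bits are the inputs.
    Assumption: bits 0..A-1 are the address, the rest are data."""
    address = case & (2 ** A - 1)
    return (case >> (A + address)) & 1

def score(program, sample):
    """Number of sampled fitness cases the program answers correctly."""
    return sum(1 for case in sample if program(case) == target(case))

def tournament(programs, sample_size, rng):
    """All members are scored on the same freshly drawn sample."""
    sample = [rng.randrange(2 ** N) for _ in range(sample_size)]
    scores = [score(p, sample) for p in programs]
    best = max(range(len(programs)), key=scores.__getitem__)
    return best, scores

# Four hypothetical tournament members (stand-ins for evolved trees).
members = [
    lambda case: 0,                   # always false
    lambda case: 1,                   # always true
    lambda case: (case >> A) & 1,     # always returns data bit 0
    target,                           # a perfect multiplexer
]

rng = random.Random(0)                # fixed seed for repeatability
winner, scores = tournament(members, sample_size=16, rng=rng)
print(winner, scores)
```

Scoring on a small sample keeps each tournament cheap, at the cost of a noisy fitness estimate; drawing a new sample for every tournament stops programs from specialising on one subset of cases.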
AND, OR, NAND, NOR
AND: &   OR: |   NAND: d   NOR: r
X Y   X & Y   X | Y   X d Y   X r Y
0 0     0       0       1       1
0 1     0       1       1       0
1 0     0       1       1       0
1 1     1       1       0       0
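The four truth tables can be regenerated programmatically, which is a convenient check on the symbols used in the slides. The dictionary layout below is an illustrative choice, not part of the original interpreter.

```python
from itertools import product

# The four primitives, keyed by the symbols used on the slide.
FUNCTIONS = {
    "&": lambda x, y: x & y,          # AND
    "|": lambda x, y: x | y,          # OR
    "d": lambda x, y: 1 - (x & y),    # NAND
    "r": lambda x, y: 1 - (x | y),    # NOR
}

def truth_table(symbol):
    """Outputs for inputs (0,0), (0,1), (1,0), (1,1), in that order."""
    f = FUNCTIONS[symbol]
    return [f(x, y) for x, y in product((0, 1), repeat=2)]

for symbol in FUNCTIONS:
    print(symbol, truth_table(symbol))
# & [0, 0, 0, 1]
# | [0, 1, 1, 1]
# d [1, 1, 1, 0]
# r [1, 0, 0, 0]

# The set has no explicit NOT, but NAND (or NOR) of a value with itself
# gives its negation.
print([FUNCTIONS["d"](x, x) for x in (0, 1)])   # [1, 0]
```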
Evolution of 20-Mux and 37-Mux
[W. B. Langdon, EuroGP-2010]
6-Mux Tree I
[W. B. Langdon, EuroGP-2010]
6-Mux Tree II
[W. B. Langdon, EuroGP-2010]
6-Mux Tree III
[W. B. Langdon, EuroGP-2010]
Ideal 6-Mux Tree
Automatically Defined Functions (ADFs)
• Genetic programming trees often have repeated patterns.
• Repeated subtrees can be treated as subroutines.
• ADFs are a methodology to automatically select and implement modularity in GP.
• This modularity can:
• Reduce the size of the GP tree
• Improve readability
Langdon’s CUDA Interpreter with ADFs
• ADFs slow down execution
• 20-Mux taking 9 hours instead of less than an hour
• 37-Mux taking more than 3 days instead of less than a day
• Improved ADF implementation
• Previously used one thread per GP program
• Now using one thread block per GP program
• Increased level of parallelism
• Reduced divergence
• 20-Mux taking 8 to 15 minutes
• 37-Mux taking 7 to 10 hours
6-Mux with ADF
Conclusion 1: GP
• Powerful machine learning algorithm
• Capable of searching through trillions of states to find the solution
• GP trees often have repeated patterns and can be compacted by ADFs
• But computationally expensive
Conclusion 2: GPU
• Computationally fast
• Relatively low cost
• Needs a new programming paradigm, which is practical.
• Accelerates processing speed up to 3000 times for computationally intensive problems.
• But not well suited for memory intensive problems.
Acknowledgement
• Dr Will Browne and Dr Mengjie Zhang for Supervision.
• Kevin Buckley for Technical Support.
• Eric for helping in CUDA compilation.
• Victoria University of Wellington for Awarding “Victoria PhD Scholarship”.
• All of You for Coming.
Thank You
Questions?