halo finder
DESCRIPTION
Halo finder - PISTON-based implementation
TRANSCRIPT
Abstract
PISTON is a portable framework which supports the development of visualization and analysis operators using a platform-independent, data-parallel programming model. Operators such as isosurface, cut-surface and threshold have been implemented in this framework, with the exact same operator code achieving good parallel performance on different architectures.
An important analysis operator in cosmology is the halo finder. A halo is a cluster of particles and is considered a common feature of interest found in cosmology data. As the number of cosmological simulations carried out in the recent past has increased, the resultant data of these simulations and the required analysis tasks have increased as well. As a consequence, there is a need to develop scalable and efficient tools to carry out the needed analysis.
Therefore, we are currently implementing a halo finder operator using PISTON. Researchers have developed a wide variety of techniques to identify halos in raw particle data. The most basic algorithm is the friends-of-friends (FOF) halo finder, where the particles are clustered based on two parameters: linking length and halo size. In a FOF halo finder, any two particles that lie within the linking length of each other are linked, each connected group of linked particles is considered one halo, and the halos are filtered based on the halo size parameter. A naive implementation of a FOF halo finder compares each and every particle pair, requiring O(n^2) operations. Our data-parallel halo finder operator uses a balanced k-d tree to reduce this number of operations in the average case, and implements the algorithm using only data-parallel primitives in order to achieve portability and performance.
Data-Parallel Halo Finder Operator in PISTON
Wathsala Widanagamaachchi (CCS-7)
University of Utah
Mentor : Christopher Sewell
Outline
● PISTON & motivation behind it
● Data-parallel programming
● Halos & halo finder
● Naive approach & data-parallel approach
● Results
What is PISTON?
● Portable framework
● Development of visualization & analysis operators
● Uses a platform-independent, data-parallel programming model
● Motivation
  – Lack of visualization software that takes full advantage of acceleration hardware and multi-core architectures
Data-Parallel programming & Thrust
● What is data parallelism?
  ● The same operation is performed by different processors on different pieces of data
● What is Thrust?
  ● Thrust is an NVIDIA C++ template library that provides CUDA and OpenMP backends
  ● Most of the STL-like algorithms in Thrust are data-parallel
    – sorting: thrust::sort and thrust::sort_by_key
      4 5 6 8 7 2 1 3 : sort: 1 2 3 4 5 6 7 8
    – scans: thrust::inclusive_scan, thrust::exclusive_scan etc.
      4 5 6 7 8 2 1 3 : sum scan: 4 9 15 22 30 32 33 36
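The two primitives on this slide can be tried out without a GPU. The sketch below is a serial stand-in that uses only the C++ standard library; the helper names sorted_copy and sum_scan are hypothetical and are not part of Thrust or PISTON. With Thrust, the corresponding calls would be thrust::sort(first, last) and thrust::inclusive_scan(first, last, result), which follow the same iterator style.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Serial stand-in for thrust::sort: returns a sorted copy of the input.
std::vector<int> sorted_copy(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}

// Serial stand-in for thrust::inclusive_scan with addition:
// out[i] is the sum of v[0] .. v[i].
std::vector<int> sum_scan(const std::vector<int>& v) {
    std::vector<int> out(v.size());
    std::partial_sum(v.begin(), v.end(), out.begin());
    return out;
}
```

Calling sorted_copy on 4 5 6 8 7 2 1 3 gives 1 2 3 4 5 6 7 8, and sum_scan on 4 5 6 7 8 2 1 3 gives 4 9 15 22 30 32 33 36, matching the slide.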
Operators in PISTON
● Isosurface, cut-surface & threshold
Halos & Halo Finder
● What is a halo?
  ● Feature of interest found in cosmology data
  ● Cluster of particles
● Halo finder
  ● Important analysis operator
  ● Friends-of-friends (FOF) halo finder
    – linking length & halo size
● Motivation behind a data-parallel solution
  ● Increased amount of simulation data available & analysis needed
Naive Approach
● Compares each & every particle pair
● Requires O(n^2) comparisons
[Figure: particles A, B, C, D, E, F, G]
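The naive approach can be sketched in a few lines. This is a minimal serial illustration, not the PISTON operator. It assumes a union-find structure to group linked particles and a "distance <= linking length" test; neither detail is given in the slides. The names Particle, find_root, naive_fof and count_halos are hypothetical.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

struct Particle { double x, y, z; };

// Union-find lookup with path halving: returns the representative of i's group.
int find_root(std::vector<int>& parent, int i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Naive FOF: compare every particle pair (O(n^2)) and link the pairs that
// lie within the linking length. Returns one group id per particle.
std::vector<int> naive_fof(const std::vector<Particle>& p, double linking_length) {
    const int n = static_cast<int>(p.size());
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);
    const double ll2 = linking_length * linking_length;
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            const double dx = p[i].x - p[j].x;
            const double dy = p[i].y - p[j].y;
            const double dz = p[i].z - p[j].z;
            if (dx * dx + dy * dy + dz * dz <= ll2) {
                const int ri = find_root(parent, i);
                const int rj = find_root(parent, j);
                parent[ri] = rj;  // merge the two groups
            }
        }
    }
    std::vector<int> group(n);
    for (int i = 0; i < n; ++i) group[i] = find_root(parent, i);
    return group;
}

// Filter step: count the groups that have at least min_size members.
int count_halos(const std::vector<int>& group, int min_size) {
    std::vector<int> size(group.size(), 0);
    for (int g : group) ++size[g];
    int count = 0;
    for (int s : size) if (s >= min_size) ++count;
    return count;
}
```

For the seven example particles A to G used later in the deck, a linking length of 2 links no pair at all, while an assumed linking length of 2.5 links C, E and F to G and leaves A, B and D on their own.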
Data-Parallel FOF Halo Finder Operator
● Balanced k-d tree from the particles
  ● A k-d tree is a space-partitioning data structure for organizing points in k-dimensional space
● Use the k-d tree to reduce the number of comparisons
● Implement using only the data-parallel primitives
  ● thrust::for_each, thrust::sort, thrust::transform, thrust::scatter, thrust::gather & thrust::copy
Balanced k-d tree Creation
[Figure: particles A, B, C, D, E, F, G]
A (1,1,0)  B (0,4,0)  C (2,6,0)  D (8,3,0)  E (4,2,0)  F (5,5,0)  G (3,4,0)

         A  B  D  C  F  E  G
X rank   1  0  6  2  5  4  3
Y rank   0  3  2  6  5  1  4
Z rank   0  1  2  3  5  4  6

K-d tree
0: A, B, C, D, E, F, G
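The per-axis ranks in the table above can be reproduced with a sort followed by a scatter. The sketch below is a serial stand-in using the standard library; in Thrust the same pattern would presumably be built from thrust::sort_by_key and thrust::scatter, but that mapping is an assumption, as are the helper name axis_ranks and the stable tie-breaking rule.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Rank of every particle along one axis (0 = smallest coordinate).
// coord[i] is the coordinate of particle i on that axis.
std::vector<int> axis_ranks(const std::vector<double>& coord) {
    const int n = static_cast<int>(coord.size());
    // Sort particle indices by coordinate; ties keep their input order
    // (an assumption, the slides do not state a tie-breaking rule).
    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::stable_sort(order.begin(), order.end(),
                     [&coord](int a, int b) { return coord[a] < coord[b]; });
    // Scatter: the particle at sorted position r receives rank r.
    std::vector<int> rank(n);
    for (int r = 0; r < n; ++r) rank[order[r]] = r;
    return rank;
}
```

With the particles in the order A, B, C, D, E, F, G, the X coordinates 1 0 2 8 4 5 3 give the ranks 1 0 2 6 4 5 3 and the Y coordinates 1 4 6 3 2 5 4 give 0 3 6 2 1 5 4, which agrees with the table once its columns are read in the order A B D C F E G. All Z coordinates are equal, so the Z ranks depend entirely on tie-breaking and may differ from the slide.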
Balanced k-d tree Creation
Segment in X axis
Split value: 2.5 in X axis

K-d tree
0: A, B, C, D, E, F, G
Balanced k-d tree Creation
Segment in X axis
Split value: 2.5 in X axis

         A  B  C  D  F  E  G
X rank   1  0  2  6  5  4  3
Y rank   0  3  6  2  5  1  4
Z rank   0  1  3  2  5  4  6

K-d tree
0
1: A, B, C    2: D, E, F, G
Balanced k-d tree Creation
Segment in X axis
Split value: 2.5 in X axis

         A  B  C | D  F  E  G
X rank   1  0  2 | 3  2  1  0
Y rank   0  1  2 | 1  3  0  2
Z rank   0  1  2 | 0  2  1  3

K-d tree
0
1: A, B, C    2: D, E, F, G
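One way to read the split step is as a median split. The segment is ordered along the split axis and cut in the middle, and the split value is the midpoint between the two particles on either side of the cut. The sketch below assumes exactly that rule, since it reproduces the split values shown in the deck; the Split struct and split_segment helper are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Split {
    std::vector<int> left, right;  // particle indices on each side
    double value;                  // split coordinate
};

// Split one segment of particle indices at its median along one axis.
// coord[i] is the coordinate of particle i on that axis.
Split split_segment(std::vector<int> segment, const std::vector<double>& coord) {
    assert(segment.size() >= 2);  // a single particle cannot be split
    std::stable_sort(segment.begin(), segment.end(),
                     [&coord](int a, int b) { return coord[a] < coord[b]; });
    const std::size_t half = segment.size() / 2;  // left side gets floor(n/2)
    Split s;
    s.left.assign(segment.begin(), segment.begin() + half);
    s.right.assign(segment.begin() + half, segment.end());
    // Midpoint between the last particle on the left and the first on the right.
    s.value = 0.5 * (coord[s.left.back()] + coord[s.right.front()]);
    return s;
}
```

With the particles indexed A = 0 to G = 6, splitting all seven along X gives the value 2.5 with A, B, C on the left and D, E, F, G on the right. Splitting those two segments along Y gives 2.5 and 3.5 respectively.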
Balanced k-d tree Creation
Segment in Y axis
Split value: 2.5 in Y axis (segment A, B, C)
Split value: 3.5 in Y axis (segment D, E, F, G)

         A  B  C  D  E  F  G
X rank   1  0  2  3  1  2  0
Y rank   0  1  2  1  0  3  2
Z rank   0  1  2  0  1  2  3

K-d tree
0
1    2
3: A    4: B, C    5: D, E    6: F, G
Balanced k-d tree Creation
Segment in Y axis
Split value: 2.5 in Y axis
Split value: 3.5 in Y axis

         A | B  C | D  E | F  G
X rank   0 | 0  1 | 1  0 | 1  0
Y rank   0 | 0  1 | 1  0 | 1  0
Z rank   0 | 0  1 | 0  1 | 0  1

K-d tree
0
1    2
3    4    5    6
Balanced k-d tree Creation
Segment in Y axis
Split value: 2.5 in Y axis
Split value: 3.5 in Y axis

         A  B  C  D  E  F  G
X rank   0  0  0  0  0  0  0
Y rank   0  0  0  0  0  0  0
Z rank   0  0  0  0  0  0  0

K-d tree
0
1    2
3: A    4    5    6
7: B    8: C    9: D    10: E    11: F    12: G
Balanced k-d tree Creation
At each k-d tree node, store the parent, child details, segment details & split value
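To make the per-node bookkeeping concrete, here is a minimal serial sketch of a level-order build. The slides do not give the operator's actual layout, so several things are assumptions: the field names, the KdNode and KdTree types, the rule that the split axis cycles X, Y, Z with depth, and the median split. The PISTON operator builds the tree from data-parallel primitives rather than with a loop like this.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

using Point = std::array<double, 3>;

struct KdNode {
    int parent = -1;            // -1 for the root
    int left = -1, right = -1;  // child node ids, -1 for a leaf
    int begin = 0, end = 0;     // segment [begin, end) in KdTree::order
    int axis = 0;               // split axis: 0 = X, 1 = Y, 2 = Z
    double split = 0.0;         // split value (unused for leaves)
};

struct KdTree {
    std::vector<int> order;     // particle indices grouped by segment
    std::vector<KdNode> nodes;  // nodes numbered level by level
};

KdTree build_kd_tree(const std::vector<Point>& pts) {
    KdTree t;
    t.order.resize(pts.size());
    std::iota(t.order.begin(), t.order.end(), 0);
    KdNode root;
    root.end = static_cast<int>(pts.size());
    t.nodes.push_back(root);
    // Children are appended as they are created, so walking the node
    // array by index visits the tree level by level.
    for (std::size_t i = 0; i < t.nodes.size(); ++i) {
        const int b = t.nodes[i].begin, e = t.nodes[i].end;
        const int axis = t.nodes[i].axis;
        if (e - b < 2) continue;  // a single particle stays a leaf
        std::stable_sort(t.order.begin() + b, t.order.begin() + e,
                         [&pts, axis](int p, int q) { return pts[p][axis] < pts[q][axis]; });
        const int mid = b + (e - b) / 2;
        t.nodes[i].split = 0.5 * (pts[t.order[mid - 1]][axis] + pts[t.order[mid]][axis]);
        KdNode l, r;
        l.parent = r.parent = static_cast<int>(i);
        l.begin = b;   l.end = mid;
        r.begin = mid; r.end = e;
        l.axis = r.axis = (axis + 1) % 3;
        t.nodes[i].left = static_cast<int>(t.nodes.size());
        t.nodes.push_back(l);
        t.nodes[i].right = static_cast<int>(t.nodes.size());
        t.nodes.push_back(r);
    }
    return t;
}
```

For the seven example particles this produces 13 nodes numbered 0 to 12, with split values 2.5 at node 0, 2.5 at node 1 and 3.5 at node 2, and particle A alone in leaf 3. Because all Z coordinates are equal, the order of particles within the last level depends on tie-breaking and need not match the slide.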
Finding Halos
● Bottom-up approach
● At each level, consider all nodes in the level

K-d tree
0
1    2
3: A    4    5    6
7: B    8: C    9: D    10: E    11: F    12: G
Finding Halos
● Bottom-up approach
● At each level, consider all nodes in the level
● Look at the split value & segment particles

Split value at node 0 is 2.5
Finding Halos
● Bottom-up approach
● At each level, consider all nodes in the level
● Look at the split value & segment particles
● Determine the particles within the linking length of the split value along the split axis

Split value at node 0 is 2.5
Linking length 2
Finding Halos
● Bottom-up approach
● At each level, consider all nodes in the level
● Look at the split value & segment particles
● Determine the particles within the linking length of the split value along the split axis
● Do m*n comparisons between the m and n particles selected on either side of the split & determine halos
● Filter halos

Split value at node 0 is 2.5
Linking length 2
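The merge at a single node can be sketched as follows. The reasoning behind the filter is that two particles on opposite sides of the split can only be within the linking length of each other if both are within the linking length of the split value along the split axis, so only those m and n candidates need the m*n comparisons. This is a serial illustration under assumptions. The union-find bookkeeping and the names find_root, dist2 and merge_across_split are hypothetical, and the operator itself expresses the step with data-parallel primitives.

```cpp
#include <array>
#include <cassert>
#include <vector>

using Point = std::array<double, 3>;

// Union-find lookup with path halving.
int find_root(std::vector<int>& parent, int i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Squared Euclidean distance between two points.
double dist2(const Point& a, const Point& b) {
    double d2 = 0.0;
    for (int k = 0; k < 3; ++k) d2 += (a[k] - b[k]) * (a[k] - b[k]);
    return d2;
}

// Merge the halos of the two child segments of one node.
// Returns the number of pairwise comparisons performed (m * n).
int merge_across_split(const std::vector<Point>& pts,
                       const std::vector<int>& left, const std::vector<int>& right,
                       int axis, double split, double linking_length,
                       std::vector<int>& parent) {
    // Keep only the particles within the linking length of the split value.
    std::vector<int> near_left, near_right;
    for (int i : left)
        if (split - pts[i][axis] <= linking_length) near_left.push_back(i);
    for (int j : right)
        if (pts[j][axis] - split <= linking_length) near_right.push_back(j);
    int comparisons = 0;
    for (int i : near_left) {
        for (int j : near_right) {
            ++comparisons;
            if (dist2(pts[i], pts[j]) <= linking_length * linking_length) {
                const int ri = find_root(parent, i);
                const int rj = find_root(parent, j);
                parent[ri] = rj;  // the two halos become one
            }
        }
    }
    return comparisons;
}
```

At node 0 of the example (split value 2.5 in X, linking length 2), only A and C on the left and E and G on the right pass the filter, so 4 comparisons replace the 12 that comparing every left particle with every right particle would need. None of those pairs is within the linking length, so nothing is merged at that node.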
Optimization - Use of Bounding Boxes
● Each node has a bounding box (BB) calculated by looking at its segment particles
● Use the BB to reduce the comparisons

K-d tree
0
1    2
3: A    4    5    6
7: B    8: C    9: D    10: E    11: F    12: G
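The slides do not say exactly how the bounding boxes prune comparisons. One plausible use, sketched below as an assumption, is to skip all comparisons between two segments whenever the gap between their axis-aligned boxes already exceeds the linking length. Box, bounding_box, box_gap2 and may_link are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

using Point = std::array<double, 3>;

struct Box { Point lo, hi; };  // axis-aligned bounding box

// Bounding box of the particles listed in one segment.
Box bounding_box(const std::vector<Point>& pts, const std::vector<int>& segment) {
    assert(!segment.empty());
    Box b{pts[segment[0]], pts[segment[0]]};
    for (int i : segment) {
        for (int a = 0; a < 3; ++a) {
            b.lo[a] = std::min(b.lo[a], pts[i][a]);
            b.hi[a] = std::max(b.hi[a], pts[i][a]);
        }
    }
    return b;
}

// Squared minimum distance between two boxes (0 if they overlap).
double box_gap2(const Box& p, const Box& q) {
    double g2 = 0.0;
    for (int a = 0; a < 3; ++a) {
        const double gap = std::max({0.0, q.lo[a] - p.hi[a], p.lo[a] - q.hi[a]});
        g2 += gap * gap;
    }
    return g2;
}

// If this returns false, no particle in p can link to any particle in q,
// so the pairwise comparisons between the two segments can be skipped.
bool may_link(const Box& p, const Box& q, double linking_length) {
    return box_gap2(p, q) <= linking_length * linking_length;
}
```

In the example, the box around A alone and the box around F and G are separated by 2 in X and 3 in Y, so with a linking length of 2 that pair of segments can be skipped without comparing any particles.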
Results
24474 particles
Results
24474 particles
Linking length: 0.2
Halo size: 100
Halos found: 10
Results
24474 particles
Linking length: 1.1
Halo size: 100
Halos found: 5
Results
Some preliminary results on halo finding using OpenMP

Number of   Number of   Timings                                                        Halos
particles   threads     k-d tree creation   Bounding box computation   Finding halos   found
21441       1           0.066s              0.00049s                   0.092s          14
            2           0.041s              0.00029s                   0.052s
            4           0.026s              0.00021s                   0.044s
42882       1           0.141s              0.0011s                    0.256s          23
            2           0.085s              0.0007s                    0.142s
            4           0.054s              0.0005s                    0.090s
Next steps...
● Get this running on CUDA
● Compare this with the VTK halo finder implementation
Thank You.