Page 1: Maximizing Parallelism in the Construction of BVHs, Octrees, and k-d Trees

Maximizing Parallelism in the Construction of BVHs, Octrees, and k-d Trees

Tero Karras

NVIDIA Research

Pages 2-3

Trees

Path tracing (Pharr & Humphreys)
Real-time ray tracing (NVIDIA)
Collision detection (Ageia)
Particle simulation (NVIDIA)
Voxel-based global illumination (Crassin et al.)
Surface reconstruction (Amenta et al.)
Photon mapping (Uchida)

Better. Faster.

Pages 4-5

Outline

Fastest existing methods are sequential
  Parallelize within each hierarchy level
  But not between levels

Lack of parallelism
  Small workloads bottlenecked by top levels
  Sub-linear scaling of performance

Pages 6-7

Outline

Novel way to build the entire tree in parallel
  Two algorithmic "building blocks"
  Fast, scalable

Main focus: BVHs
  Point-based octrees and k-d trees also covered in the paper

Pages 8-9

Bounding volume hierarchy

[Figure not preserved in the transcript]

Pages 10-14

LBVH - Lauterbach et al. [2009]

1. Assign Morton codes
2. Sort primitives
3. Generate hierarchy
4. Fit bounding boxes

Morton code of a point p, formed by interleaving the bits of its coordinates (written as binary fractions):

px = 0.1010
py = 0.0111
pz = 0.1100

code = 101011110010
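Step 1 of the list above can be sketched in a few lines of Python. This is only an illustration, not the talk's CUDA code: the helper names `quantize` and `morton3` are made up for this example, and the bit order (x first in each triple) is an assumption chosen to match the code shown on the slide. The default of 10 bits per axis corresponds to the 30-bit Morton codes mentioned in the results.

```python
def quantize(p: float, bits: int = 10) -> int:
    """Map a coordinate in [0, 1) to an unsigned integer with `bits` bits."""
    scaled = int(p * (1 << bits))
    return min(max(scaled, 0), (1 << bits) - 1)


def morton3(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the bits of x, y, z; x is the most significant of each triple."""
    code = 0
    for b in range(bits - 1, -1, -1):
        triple = (((x >> b) & 1) << 2) | (((y >> b) & 1) << 1) | ((z >> b) & 1)
        code = (code << 3) | triple
    return code


# The slide's example point: px = 0.1010, py = 0.0111, pz = 0.1100 in binary,
# i.e. 0.625, 0.4375 and 0.75 in decimal, quantized to 4 bits per axis.
code = morton3(quantize(0.625, 4), quantize(0.4375, 4), quantize(0.75, 4), bits=4)
print(format(code, "012b"))  # 101011110010
```

Sorting primitives by this integer (step 2) places points that are close in space near each other in the sorted array.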

Pages 15-19

Binary radix tree

Sorted keys (leaf nodes):

index  key
0      00001
1      00010
2      00100
3      00101
4      10011
5      11000
6      11001
7      11110

Internal nodes, each labeled with the common prefix of the keys it covers:

prefix   keys covered
(empty)  0-7 (root)
00       0-3
000      0-1
0010     2-3
1        4-7
11       5-7
1100     5-6

n leaf nodes, n-1 internal nodes
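As a cross-check of the tree above, the internal-node prefixes can be derived from the sorted keys with a simple top-down recursion. This is a sequential reference sketch, not the parallel method presented in the talk; the function names are hypothetical and keys are stored as bit strings for readability.

```python
KEYS = ["00001", "00010", "00100", "00101", "10011", "11000", "11001", "11110"]


def common_prefix(a: str, b: str) -> str:
    """Longest common prefix of two bit strings."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]


def internal_prefixes(keys, first, last, out):
    """Collect the prefix of every internal node covering keys[first..last]."""
    if first == last:  # a single key is a leaf, not an internal node
        return out
    prefix = common_prefix(keys[first], keys[last])
    out.append(prefix)
    # Keys are sorted and unique, so the bit right after the prefix is 0 for
    # the first key and 1 for the last; the split is where it flips.
    bit = len(prefix)
    split = first
    while keys[split + 1][bit] == "0":
        split += 1
    internal_prefixes(keys, first, split, out)
    internal_prefixes(keys, split + 1, last, out)
    return out


prefixes = internal_prefixes(KEYS, 0, len(KEYS) - 1, [])
print(prefixes)  # ['', '00', '000', '0010', '1', '11', '1100']
```

The list has n-1 = 7 entries for n = 8 keys, matching the node count stated on the slide.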

Pages 20-22

Longest common prefix

δ(i,j) is the length of the longest common prefix between keys i and j. Because the keys are sorted, the prefix shared by a whole range of keys is the prefix shared by its first and last key.

δ(5,6) = 4    keys 11000 and 11001 share the prefix 1100
δ(0,3) = 2    keys 00001 and 00101 share the prefix 00
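For integer keys, δ can be evaluated with one XOR and a leading-zero count. The sketch below uses Python's `int.bit_length` in place of a hardware count-leading-zeros instruction; the names `KEYS`, `BITS`, and `delta` are chosen for this example only.

```python
KEYS = [0b00001, 0b00010, 0b00100, 0b00101, 0b10011, 0b11000, 0b11001, 0b11110]
BITS = 5  # key width used in the slide example


def delta(i: int, j: int, keys=KEYS, bits: int = BITS) -> int:
    """Length of the longest common prefix of keys[i] and keys[j]."""
    # XOR leaves a 1 at every differing bit; the position of the highest 1
    # tells how many leading bits agree.
    return bits - (keys[i] ^ keys[j]).bit_length()


print(delta(5, 6))  # 4
print(delta(0, 3))  # 2
```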

Page 23

Garanzha et al. [2011]

Keys: 00001 00010 00100 00101 10011 11000 11001 11110 (indices 0-7)

Internal nodes are numbered level by level:

Level 0   1 node    (node 0)
Level 1   2 nodes   (nodes 1-2)
Level 2   3 nodes   (nodes 3-5)
Level 3   1 node    (node 6)

Pages 24-27

Our method

Define a numbering scheme for the nodes
  Gain some knowledge of their identity
  Establish a connection with the keys

Find the children of a given node
  Only look at node index and nearby keys

Do this for all nodes in parallel

Pages 28-31

Numbering scheme

Each internal node shares its index with either the first or the last key of the range it covers:

node  prefix   keys covered  children
0     (empty)  0-7           node 3, node 4
3     00       0-3           node 1, node 2
4     1        4-7           leaf 4, node 5
1     000      0-1           leaf 0, leaf 1
2     0010     2-3           leaf 2, leaf 3
5     11       5-7           node 6, leaf 7
6     1100     5-6           leaf 5, leaf 6

Pages 32-36

Algorithm

Worked example for internal node 3 (key 3 = 00101):

1. Direction: δ(2,3) = 4 and δ(3,4) = 0, so the range extends toward lower indices and key 4 lies outside it.
2. Extent: δ(1,3) = 2 and δ(0,3) = 2 both exceed δ(3,4) = 0, so the range is keys 0-3.
3. Split: the node's prefix length is δ(0,3) = 2. δ(2,3) = 4 exceeds it but δ(1,3) = 2 does not, so the split falls between keys 1 and 2.
4. Children: node 1 (keys 0-1) and node 2 (keys 2-3).

Page 37

Algorithm

For each node i = 0..n-2 in parallel:

1. Determine direction of the range
2. Expand the range as far as possible
3. Find where to split the range
4. Identify children

Steps 2 and 3 use binary search.
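The four steps can be sketched for a single node as follows. This is a sequential Python rendering written for illustration, assuming sorted unique integer keys; `delta` and `find_children` are names chosen for this example, and on a GPU each value of i would be handled by its own thread. Out-of-range indices are given δ = -1 so that they never extend a range.

```python
KEYS = [0b00001, 0b00010, 0b00100, 0b00101, 0b10011, 0b11000, 0b11001, 0b11110]


def delta(keys, i, j, bits=5):
    """Common prefix length of keys i and j, or -1 if j is out of range."""
    if j < 0 or j >= len(keys):
        return -1
    return bits - (keys[i] ^ keys[j]).bit_length()


def find_children(keys, i, bits=5):
    """Children of internal node i as (kind, index), kind is 'node' or 'leaf'."""
    # 1. Direction of the range: toward the neighbor sharing the longer prefix.
    d = 1 if delta(keys, i, i + 1, bits) > delta(keys, i, i - 1, bits) else -1
    delta_min = delta(keys, i, i - d, bits)
    # 2. Expand the range: double an upper bound, then binary search the end.
    l_max = 2
    while delta(keys, i, i + l_max * d, bits) > delta_min:
        l_max *= 2
    length, t = 0, l_max // 2
    while t >= 1:
        if delta(keys, i, i + (length + t) * d, bits) > delta_min:
            length += t
        t //= 2
    j = i + length * d
    # 3. Split position: binary search for the last key that shares more than
    #    the node's own prefix with key i.
    delta_node = delta(keys, i, j, bits)
    s, t = 0, length
    while True:
        t = (t + 1) // 2
        if delta(keys, i, i + (s + t) * d, bits) > delta_node:
            s += t
        if t <= 1:
            break
    gamma = i + s * d + min(d, 0)
    # 4. Children: a child covering a single key is a leaf.
    left = ("leaf" if min(i, j) == gamma else "node", gamma)
    right = ("leaf" if max(i, j) == gamma + 1 else "node", gamma + 1)
    return left, right


for i in range(len(KEYS) - 1):
    print(i, find_children(KEYS, i))
```

For node 3 this yields children node 1 and node 2, and for the root it yields node 3 and node 4, in line with the numbering shown on the earlier slides.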

Pages 38-40

Duplicate keys

The algorithm only works with unique keys
  Duplicates are common in practice

Trick: Augment each key with its index
  Distinguishes between duplicates
  Keys are still in lexicographical order

Tie-break when evaluating δ(i,j)
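One way to read the tie-break is that the augmented keys never need to be stored: when two keys are identical, δ falls back to comparing the indices, as if each key had its index appended. The sketch below makes that assumption explicit; the 30-bit key width comes from the results slides, while the 32-bit index width and the function name are choices made for this example.

```python
def delta(keys, i, j, key_bits=30, index_bits=32):
    """Common prefix length of keys i and j, treating (key, index) as one key."""
    if j < 0 or j >= len(keys):
        return -1
    if keys[i] != keys[j]:
        return key_bits - (keys[i] ^ keys[j]).bit_length()
    # Identical keys: all key bits match, so continue counting matching
    # leading bits in the indices.
    return key_bits + index_bits - (i ^ j).bit_length()


keys = [7, 7, 7, 9]  # three duplicates followed by a distinct key
print(delta(keys, 0, 1), delta(keys, 0, 2), delta(keys, 2, 3))  # 61 60 26
```

Duplicates now produce distinct δ values above the key width, so the range and split searches behave as they do for unique keys.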

Page 41

LBVH

1. Assign Morton codes
2. Sort primitives
3. Generate hierarchy
4. Fit bounding boxes

Page 42

Lauterbach et al. [2009]

[Figure not preserved in the transcript]

Pages 43-45

Our method

Need a different approach
  How many levels are there?
  Which nodes are located on a given level?

Traverse paths in the tree in parallel
  Start from leaves, advance toward the root
  Terminate threads using per-node atomic flags
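The bottom-up traversal can be emulated sequentially to show why the per-node flags are enough. In this sketch each loop iteration stands in for one GPU thread, and a plain counter stands in for the atomic flag; the tree is the 8-key example from the earlier slides, and the unit-cube leaf boxes are made-up data. The first thread to reach a node stops there, and the second one knows both children are finished, so it merges their boxes and continues upward.

```python
# Children of each internal node in the 8-key example tree.
CHILDREN = {
    0: (("node", 3), ("node", 4)),
    1: (("leaf", 0), ("leaf", 1)),
    2: (("leaf", 2), ("leaf", 3)),
    3: (("node", 1), ("node", 2)),
    4: (("leaf", 4), ("node", 5)),
    5: (("node", 6), ("leaf", 7)),
    6: (("leaf", 5), ("leaf", 6)),
}


def union(a, b):
    """Smallest axis-aligned box containing boxes a and b, each (lo, hi)."""
    (alo, ahi), (blo, bhi) = a, b
    return (tuple(map(min, alo, blo)), tuple(map(max, ahi, bhi)))


def fit_boxes(children, leaf_boxes):
    parent = {kid: node for node, kids in children.items() for kid in kids}
    visits = {node: 0 for node in children}  # stands in for the atomic flags
    boxes = {("leaf", k): box for k, box in enumerate(leaf_boxes)}
    for k in range(len(leaf_boxes)):  # one "thread" per leaf
        current = ("leaf", k)
        while current in parent:
            node = parent[current]
            visits[node] += 1  # would be an atomic increment on a GPU
            if visits[node] == 1:
                break  # first arrival: the sibling subtree is not done yet
            left, right = children[node]
            boxes[("node", node)] = union(boxes[left], boxes[right])
            current = ("node", node)
    return boxes


# Hypothetical leaf boxes: unit cubes lined up along the x axis.
leaves = [((k, 0, 0), (k + 1, 1, 1)) for k in range(8)]
boxes = fit_boxes(CHILDREN, leaves)
print(boxes[("node", 0)])  # ((0, 0, 0), (8, 1, 1))
```

No level count or per-level node list is needed, which addresses the two questions raised on the slide.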

Pages 46-48

Results

Evaluate performance on GTX 480 (Fermi)
  CUDA, 30-bit Morton codes

Compare against Garanzha et al. [2011]
  Identical tree (top-level SAH splits disabled)

Simulate large GPUs
  N times as many cores
  N times the memory bandwidth

Pages 49-50

Results

Fairy Forest, 174K triangles

[Bar charts: time in milliseconds (axis 0 to 2) for the Morton, Sort, Build, and AABB stages, our method vs. Garanzha et al., at 1×, 2×, and 4× cores]

Speedup annotations on the charts: 1.7×, 1.1×, 12.5×, 33.6×, 1.3×, 2.4×

Page 51

Results

Turbine Blade, 1.77M triangles

[Bar charts: time in milliseconds (axis 0 to 5) for the Morton, Sort, Build, and AABB stages, our method vs. Garanzha et al.]

Speedup annotations on the charts: 3.7×, 7.0×, 0.8×, 1.0×

Page 52

Results

Stanford Dragon, 871K triangles

[Charts: time in milliseconds (axis 0 to 6) for the Morton, Sort, Build, and AABB stages, our method vs. Garanzha et al.; horizontal axis ticks run from 1 to 7.875 in steps of 0.625]

Page 53

Acknowledgements

Timo Aila, Samuli Laine, David Luebke, Jacopo Pantaleoni, Jaakko Lehtinen

For helpful suggestions and proofreading.

Page 54

Thank You

Questions

Page 55

Backup slides

Page 56

Pseudocode

[Figure not preserved in the transcript]

Page 57

Results

Eval and Sort are common to both methods. Times appear to be in milliseconds, as in the charts.

Scene                        Cores  Eval  Sort  Build Our  Build Prev  AABB Our  AABB Prev
Fairy Forest (174K tris)     1x     0.05  0.56  0.15       1.88        0.23      0.29
                             2x     0.03  0.33  0.09       1.75        0.13      0.22
                             4x     0.02  0.30  0.05       1.68        0.08      0.19
Conference Room (283K tris)  1x     0.08  0.78  0.24       1.93        0.35      0.38
                             2x     0.04  0.51  0.13       1.72        0.19      0.26
                             4x     0.03  0.31  0.08       1.58        0.12      0.20
Stanford Dragon (871K tris)  1x     0.22  1.67  0.65       3.14        1.10      1.03
                             2x     0.12  1.09  0.34       2.38        0.57      0.60
                             4x     0.06  0.64  0.18       1.97        0.30      0.38
Turbine Blade (1.77M tris)   1x     0.45  2.73  1.28       4.73        2.10      1.77
                             2x     0.23  1.63  0.65       3.19        1.07      0.96
                             4x     0.12  1.08  0.34       2.37        0.56      0.55
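The Build columns of this table reproduce several of the speedup figures annotated on the result charts. The short check below copies four rows from the table and divides the previous method's time by the new one; the dictionary layout and function name are illustrative only.

```python
# Build times from the table: (scene, core multiplier) -> (our, prev).
BUILD_TIMES = {
    ("Fairy Forest", 1): (0.15, 1.88),
    ("Fairy Forest", 4): (0.05, 1.68),
    ("Turbine Blade", 1): (1.28, 4.73),
    ("Turbine Blade", 4): (0.34, 2.37),
}


def speedup(our: float, prev: float) -> float:
    """How many times faster the new method is, rounded to one decimal."""
    return round(prev / our, 1)


for (scene, cores), (our, prev) in BUILD_TIMES.items():
    print(f"{scene} at {cores}x cores: {speedup(our, prev)}x")
```

The results are 12.5 and 33.6 for Fairy Forest and 3.7 and 7.0 for Turbine Blade, which suggests those chart annotations refer to the Build stage at 1× and 4× cores.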

Pages 58-62

Algorithm

Worked example for internal node 2 (key 2 = 00100):

1. Direction: δ(1,2) = 2 and δ(2,3) = 4, so the range extends toward higher indices and key 1 lies outside it.
2. Extent: δ(2,4) = 0 does not exceed δ(1,2) = 2, so the range is keys 2-3.
3. Split: the node's prefix length is δ(2,3) = 4, and the range holds only two keys, so the split falls between keys 2 and 3.
4. Children: leaf 2 and leaf 3.