opencl accelerated rigid body and collision detectiontrink/rss-2011/presentations/coumans.pdf ·...

31
OpenCL accelerated rigid body and collision detection Erwin Coumans Advanced Micro Devices Robotics: Science and Systems Conference 2011

Upload: others

Post on 17-Oct-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

OpenCL accelerated rigid body and collision detection

Erwin Coumans

Advanced Micro Devices

Robotics: Science and Systems Conference 2011

Page 2: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Overview

• Intro

• GPU broadphase acceleration structures

• GPU convex contact generation and reduction

• GPU BVH acceleration for concave shapes

• GPU constraint solver

Robotics: Science and Systems Conference 2011

Page 3: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Industry view

Game and Film physics Simulation

Hardware Platform

Games Industry

Movie Industry

Academia, Universities

Content Creation Tools

Conferences, Presentations

C++ APIs and Implementations

Data representation

Sony Imageworks

PDI Dreamworks

ILM, Disney

Rockstar, Epic

EA, Disney

PS3, Xbox 360, PC, iPhone, Android

OpenCL, CUDA

Havok, PhysX Bullet, ODE,

Newton, PhysBAM, Box2D

Custom in-house physics engines

FBX, COLLADA

binary .hkx, .bullet format

Maya, 3ds Max, Houdini, LW, Cinema 4D

Blender

GDC, SIGGRAPH Stanford, UNC

etc.

x86, PowerPC, Cell, ARM

Robotics: Science and Systems Conference 2011

Page 4: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Our open source work

• Bullet Physics SDK, http://bulletphysics.org

• Sony Computer Entertainment Physics Effects

• OpenCL/DirectX11 GPU physics research

Robotics: Science and Systems Conference 2011

Page 5: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

OpenCL™

• Open development platform for multi-vendor heterogeneous architectures

• The power of AMD Fusion: Leverages CPUs and GPUs for balanced system approach

• Broad industry support: Created by architects from AMD, Apple, IBM, Intel, NVIDIA, Sony, etc. AMD is the first company to provide a complete OpenCL solution

• Kernels written in subset of C99

Robotics: Science and Systems Conference 2011

Page 6: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Particle

State vector

Rigid body

xX

v

x

qX

v

/

vX

F m

1

2

/

( ) /

v

qX

F m

I I

Robotics: Science and Systems Conference

2011

Page 7: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Rigid body simulation loop

First update the linear and angular velocity

Then the position

1

1

1

t h t

t t

v v hFm

h I

1

2

t h t t h

t h t t h t

x x hv

q q h q

Robotics: Science and Systems Conference

2011

Page 8: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

OpenCL kernel: Integration

__kernel void

interopKernel( const int startOffset, const int numNodes, __global float4

*g_vertexBuffer,__global float4 *linVel,__global float4 *pAngVel)

{

int nodeID = get_global_id(0);

float timeStep = 0.0166666;

if( nodeID < numNodes )

{

g_vertexBuffer[nodeID + startOffset/4] += linVel[nodeID]*timeStep;

float4 axis;

float4 angvel = pAngVel[nodeID];

float fAngle = native_sqrt(dot(angvel, angvel));

axis = angvel * ( native_sin(0.5f * fAngle * timeStep) / fAngle);

float4 dorn = axis;

dorn.w = native_cos(fAngle * timeStep * 0.5f);

float4 orn0 = g_vertexBuffer[nodeID + startOffset/4+numNodes];

float4 predictedOrn = quatMult(dorn, orn0);

predictedOrn = quatNorm(predictedOrn);

g_vertexBuffer[nodeID + startOffset/4+numNodes]=predictedOrn;

}

}

Robotics: Science and Systems Conference 2011

Page 9: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

OpenCL – OpenGL interop

• See http://github.com/erwincoumans/experiments

Robotics: Science and Systems Conference

2011

Page 10: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Today’s execution model

• Single Program Multiple Data (SPMD)

• Same kernel runs on: • All compute units

• All processing elements

• Purely “data parallel” mode

• PCIe bus between CPU and GPU is a bottleneck

Robotics: Science and Systems Conference 2011

Page 11: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Tomorrow’s execution model

• Multiple Program Multiple Data (SPMD)

• Nested data parallelism

– Kernels can enqueue work

• CPU and GPU on one die

– Unified memory

spawn

WG

WI

WI WI

WI

WG

WI

WI WI

WI

WG

WI

WI WI

WI

spawn

WG

WI

WI WI

WI

Robotics: Science and Systems Conference 2011

Page 12: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Physics Pipeline

Forward Dynamics Computation

Collision Detection Computation

Detect

pairs

Forward Dynamics Computation

Setup

constraints

Solve

constraints

Integrate

position

Apply

gravity

Collision Data Dynamics Data

Compute

AABBs

Predict

transforms

Collision

shapes

Object

AABBs

Overlapping

pairs

World

transforms

velocities

Mass

Inertia

Constraints

(contacts,

joints)

Compute

contact

points

Contact

points

time Start End

AABB = axis aligned bounding box Robotics: Science and Systems Conference

2011

Page 13: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Broadphase N-body problem

• Avoid brute-force N*N tests

• Input: world space BVs and unique IDs

• Output: array of potential overlapping pairs

Broadphase Collision Detection

Detect

pairs

Compute

AABBs

Robotics: Science and Systems Conference 2011

Page 14: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Uniform grid

• Very GPU friendly, parallel radix sort

• Use modulo to make grid unbounded

Robotics: Science and Systems Conference 2011

Page 15: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Non uniform work granularity

• Small versus small

– GPU

• Large versus small

– CPU

• Large versus large

– CPU

Robotics: Science and Systems Conference 2011

Page 16: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Incremental sweep and prune

• Update 3 sorted axis and overlapping pairs

• Does’t parallelize easy: data dependencies

Robotics: Science and Systems Conference 2011

Page 17: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Parallel 1 axis sweep and prune

• From scratch sort 1 axis sweep to find all pairs

• Parallel radix sort and parallel sweep

[Game Physics Pearls, 2010, AK Peters]

Robotics: Science and Systems Conference 2011

Page 18: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Dynamic BVH tree broadphase

• Keep two dynamic trees, one for moving objects, other for objects (sleeping/static)

• Find neighbor pairs:

– Overlap M versus M and Overlap M versus S

D F

B C E A

Q P

U R S T

S: Non-moving DBVT M: Moving DBVT

Robotics: Science and Systems Conference 2011

Page 19: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Update/move a leaf node

• If new AABB is contained by old do nothing

• Otherwise remove and re-insert leaf – Re-insert at closest ancestor that was not resized

during remove

• Expand AABB with margin – Avoid updates due to jitter or small random

motion

• Expand AABB with velocity – Handle the case of linear motion over n frames

Robotics: Science and Systems Conference

2011

Page 20: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Parallel BVH tree traversal

• Incremental update on CPU (shared memory)

• Use parallel history traversal on GPU

00 00

00

00

10 10

00

00

10 10

00

00

10 11

00

00

10 00

00

00

10 11

00

00

10 00

00

00

11 00

00

00

Robotics: Science and Systems Conference 2011

Page 21: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Contact generation

• Input: overlapping pairs, collision shapes

• Output: contact points

Robotics: Science and Systems Conference 2011

Page 22: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Closest point algorithms

• Algebraic closed forms, ie. sphere-sphere

• Separating axis theorem for convex polyhedra • Only computes overlap, no positive distances

• Gilbert Johnson Kheerthi (GJK) for general convex objects

• Doesn’t compute overlap, needs a companion algorithm for penetrating case, for example the Expanding Polytope Algorithm

Robotics: Science and Systems Conference 2011

Page 23: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Separating axis test

• Test all axis in parallel, use cube map

Robotics: Science and Systems Conference 2011

Page 24: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

GJK or EPA

Robotics: Science and Systems Conference 2011

Page 25: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Sutherland–Hodgman clipping

• Clip incident face against reference face side planes (but not the reference face).

• Consider clip points with positive penetration.

n

clipping planes

Robotics: Science and Systems Conference 2011

Page 26: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

• GPU friendly convex versus convex

• See “Rigid body collision detection on the GPU” by Rahul Sathe et al, SIGGRAPH 2006 poster

Cube map

Robotics: Science and Systems Conference 2011

Page 27: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Contact reduction

• Keep only 4 points

Robotics: Science and Systems Conference 2011

Page 28: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Concave shapes

• Hierarchical approximate convex decomposition (ICIP 2009 proceedings, Khaled Mamou)

• http://sourceforge.net/projects/hacd

Robotics: Science and Systems Conference 2011

Page 29: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Parallelizing Constraint Solver

• Projected Gauss Seidel iterations are not embarrassingly parallel

A

B D

1 4

Robotics: Science and Systems Conference 2011

Page 30: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Reordering constraint batches

A

B D

1 4

A B C D

1 1

2 2

3 3

4 4

A B C D

1 1 3 3

4 2 2 4

Robotics: Science and Systems Conference 2011

Page 31: OpenCL accelerated rigid body and collision detectiontrink/RSS-2011/Presentations/coumans.pdf · Overview •Intro •GPU broadphase acceleration structures ... COLLADA binary .hkx,

Thanks!

• For more information: http://bulletphysics.org

[email protected]

Robotics: Science and Systems Conference 2011