gs-4147, tressfx 2.0, by bill-bilodeau

Upload: amd-developer-central

Post on 05-Dec-2014

DESCRIPTION

Presentation GS-4147 by Bill Bilodeau at the AMD Developer Summit (APU13) November 11-13, 2013

TRANSCRIPT

Page 1: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 2.0 AND BEYOND BILL BILODEAU, AMD

DONGSOO HAN, AMD

Page 2: GS-4147, TressFX 2.0, by Bill-Bilodeau

2 TressFX 2.0 and Beyond NOVEMBER 12, 2013 | AMD DEVELOPER SUMMIT

TressFX 2.0 AND BEYOND

TressFX Overview

TressFX Rendering

‒ TressFX 2.0 improvements

TressFX Physics

Future Work

AGENDA

Page 3: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX OVERVIEW

Page 4: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX OVERVIEW

Realistic hair rendering and simulation

‒ Used in Tomb Raider

Goes beyond the simple shells-and-fins representation used in games

Hair is rendered as thousands of strands with self shadowing, antialiasing and transparency

Physical simulation for each strand using GPU compute shaders

Very flexible to allow for different hair styles and different conditions

WHAT IS TressFX?

Page 5: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX RENDERING

What goes into good hair?

‒ Anti-aliasing

‒ Volumetric self shadowing

‒ Transparency

WHAT MAKES IT LOOK GOOD

[Figure: four renderings compared: Basic Rendering; Antialiasing; Antialiasing + Self Shadowing; Antialiasing + Self Shadowing + Transparency]

Page 6: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX RENDERING

Page 7: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX RENDERING

Kajiya-Kay Hair Lighting Model

‒ Anisotropic hair strand lighting model

‒ Uses the tangent along the strand instead of the normal for light reflections

‒ Instead of cos(N, H), use sin(T, H)

Marschner Model

‒ Two specular highlights

‒ Primary light colored highlight shifted towards the tip

‒ Secondary hair colored highlight shifted towards the root

‒ TressFX uses an approximation of the Marschner technique when rendering two highlights

LIGHTING MODEL

Primary Highlights

Secondary Highlights
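The sin(T, H) term above can be sketched on the CPU as follows. This is a minimal illustration, not the TressFX shader: the function names, the exponent parameter and the tangent-shift helper are assumptions. Shifting the tangent along the normal is one common way to move a highlight along the strand; the slide does not say which approximation TressFX uses.

```python
import math

def normalize(v):
    """Return v scaled to unit length (v must be non-zero)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def strand_specular(tangent, light_dir, view_dir, exponent):
    """Kajiya-Kay style specular term: sin(T, H) raised to a power.

    Assumes light_dir and view_dir are not exactly opposite,
    otherwise the half vector H is undefined.
    """
    t = normalize(tangent)
    l = normalize(light_dir)
    v = normalize(view_dir)
    h = normalize(tuple(a + b for a, b in zip(l, v)))  # half vector
    cos_th = dot(t, h)
    sin_th = math.sqrt(max(0.0, 1.0 - cos_th * cos_th))
    return sin_th ** exponent

def shift_tangent(tangent, normal, shift):
    """Hypothetical helper: tilt the tangent along the normal.

    A positive or negative shift moves the highlight along the strand,
    which is one way to place two separate highlights.
    """
    return normalize(tuple(t + shift * n for t, n in zip(tangent, normal)))
```

The highlight is strongest when the strand tangent is perpendicular to the half vector and vanishes when the two are parallel, the opposite of the usual cos(N, H) behaviour.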

Page 8: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX RENDERING

Every hair strand is anti-aliased manually

‒ Not using Hardware MSAA!

Compute pixel coverage on edges of hair strands and convert it to an alpha value

ANTI-ALIASING
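A rough sketch of turning edge coverage into an alpha value, under a simplifying assumption that is not from the slides: the pixel is treated as a 1-pixel-wide interval perpendicular to the strand, and the strand as a band of a given width. The function name and parameters are hypothetical.

```python
def strand_coverage(distance, width):
    """Fraction of a 1-pixel-wide interval covered by a strand band.

    distance: pixel center to strand centerline, in pixels.
    width:    strand width, in pixels.
    The result is in [0, 1] and can be used directly as an alpha value.
    """
    lo = max(distance - 0.5 * width, -0.5)  # band start, clipped to the pixel
    hi = min(distance + 0.5 * width, 0.5)   # band end, clipped to the pixel
    return max(0.0, hi - lo)
```

A strand thinner than a pixel never reaches full coverage in this model, so thin strands fade out instead of flickering between fully on and fully off.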

Page 9: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX RENDERING

Self Shadowing

‒ Uses a simplified Deep Shadow Map technique

SELF SHADOWING

[Figure: No Self Shadows vs. With Self Shadows]

Page 10: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX RENDERING

Order Independent Transparency (OIT) using Per-Pixel Linked Lists (PPLL)

Fragments are stored in linked lists on the GPU

Nearest K fragments are rendered in back to front order

TRANSPARENCY

[Figure: No Transparency vs. With Transparency]

Page 11: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 1.0 RENDERING

TressFX 1.0 Rendering

‒ Render hair strand geometry into A-buffer

‒ Do lighting, shadowing, and antialiasing

‒ Store fragment color with depth and coverage in per-pixel linked list (PPLL)

‒ Render the K nearest fragments (K-buffer) in back to front order

‒ Blend nearest K fragments in the correct order with transparency

‒ Blend the remaining fragments without sorting

How rendering was done in version 1.0
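The last three steps can be illustrated for a single pixel. This is a simplified sketch, not the shader: it sorts all fragments to find the K nearest, whereas a pixel shader would more likely keep a small K-entry buffer while walking the list. The fragment layout (depth, rgb, alpha) and the function names are assumptions.

```python
def blend_over(dst, src, alpha):
    """Standard 'over' blend of src onto dst with the given alpha."""
    return tuple(s * alpha + d * (1.0 - alpha) for s, d in zip(src, dst))

def resolve_pixel(fragments, background, k):
    """Blend one pixel's fragments: the K nearest in order, the rest unsorted.

    fragments: list of (depth, rgb, alpha) in arbitrary (list traversal) order;
               a smaller depth means closer to the camera.
    """
    by_depth = sorted(range(len(fragments)), key=lambda i: fragments[i][0])
    nearest = by_depth[:k]
    nearest_set = set(nearest)

    color = background
    # Remaining fragments: blended in whatever order they were stored.
    for i, (_, rgb, alpha) in enumerate(fragments):
        if i not in nearest_set:
            color = blend_over(color, rgb, alpha)
    # Nearest K fragments: blended back to front (farthest of the K first).
    for i in reversed(nearest):
        _, rgb, alpha = fragments[i]
        color = blend_over(color, rgb, alpha)
    return color
```

Only the K nearest fragments pay for sorting; ordering errors among the rest are hidden behind the correctly ordered front layers.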

Page 12: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 1.0 RENDERING A-BUFFER PASS

[Diagram: Hair Geometry → Vertex Shader → Pixel Shader (Coverage, Lighting, Shadows) → Head UAV and PPLL UAV. Each PPLL node stores depth, color, coverage, next.]

Page 13: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 1.0 RENDERING

GPU implementation of order independent transparency (OIT)

Head UAV

‒ Each pixel location has a “head pointer” to a linked list in the PPLL UAV

PPLL UAV

‒ As new fragments are rendered, they are added to the next open location in the PPLL (using UAV counter)

‒ A link is created to the fragment pointed to by the head pointer

‒ Head pointer then points to the new fragment

PER-PIXEL LINKED LIST

// Retrieve current pixel count and increase counter
uint uPixelCount = LinkedListUAV.IncrementCounter();
uint uOldStartOffset;

// Exchange indices in LinkedListHead texture corresponding to pixel location
InterlockedExchange(LinkedListHeadUAV[address], uPixelCount, uOldStartOffset);

// Append new element at the end of the Fragment and Link Buffer
Element.uNext = uOldStartOffset;
LinkedListUAV[uPixelCount] = Element;
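The same bookkeeping can be followed in a single-threaded CPU sketch. It mirrors the three steps of the excerpt (take a slot from the counter, swap the head pointer, link the new node to the old head). The class name, the end-of-list sentinel and the overflow behaviour are assumptions, and no atomics are needed because nothing runs in parallel here.

```python
END_OF_LIST = 0xFFFFFFFF  # assumed sentinel for "no next node"

class PerPixelLinkedList:
    def __init__(self, width, height, max_nodes):
        self.width = width
        self.head = [END_OF_LIST] * (width * height)  # plays the Head UAV
        self.nodes = [None] * max_nodes                # plays the PPLL UAV
        self.counter = 0                                # plays the UAV counter

    def insert(self, x, y, fragment):
        """Push a fragment onto the list of pixel (x, y); False if the pool is full."""
        if self.counter >= len(self.nodes):
            return False                   # fragment is dropped
        index = self.counter               # IncrementCounter()
        self.counter += 1
        address = y * self.width + x
        old_head = self.head[address]      # InterlockedExchange(...)
        self.head[address] = index
        self.nodes[index] = (fragment, old_head)  # Element.uNext = old head
        return True

    def fragments_at(self, x, y):
        """Walk the list of pixel (x, y), most recently inserted fragment first."""
        out = []
        index = self.head[y * self.width + x]
        while index != END_OF_LIST:
            fragment, index = self.nodes[index]
            out.append(fragment)
        return out
```

Fragments of different pixels are interleaved in the node pool; only the head pointers and the next links keep each pixel's list together.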

Page 14: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 1.0 RENDERING K-BUFFER PASS

[Diagram: Full Screen Quad → Vertex Shader → Pixel Shader (K-Buffer, Transparency). The linked-list nodes read by the pixel shader each store depth, color, coverage.]

Page 15: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 1.0 RENDERING

Observation

‒ All fragments are lit and shadowed equally

‒ Even the ones buried under dozens of hair fragments that you can’t see

Solution

‒ Defer the lighting and shadowing until the k-buffer pass

‒ Render the nearest K fragments with high quality

‒ Render the remaining fragments with lower quality (but faster)

HOW CAN WE MAKE IT FASTER?

Page 16: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 2.0 RENDERING A-BUFFER PASS

[Diagram: Hair Geometry → Vertex Shader → Pixel Shader (Coverage). Each PPLL node stores depth, coverage, tangent, next.]

Page 17: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 2.0 RENDERING K-BUFFER PASS

[Diagram: Full Screen Quad → Vertex Shader → Pixel Shader (K-Buffer, Lighting, Shadows, Transparency). The linked-list nodes read by the pixel shader each store depth, coverage, tangent.]

Page 18: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 2.0 IMPROVEMENTS

Distance to camera can be used for reducing the density of the hair

‒ Uniformly remove hair strands from the rendering

‒ To compensate for missing strands, thicken the hair

‒ Adjust the minimum pixel coverage with distance

CONTINUOUS LODs

[Figure: Full Density Hair; Reduced Density Hair; Reduced Density with Thicker Strands]
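One way the distance-based LOD could be parameterized is sketched below. Everything specific here is an assumption for illustration: the linear falloff between a near and a far distance, the minimum density, and thickening by 1/density so that the total covered area stays roughly constant. The slides give no formulas.

```python
def lod_settings(distance, near, far, min_density):
    """Return (density, thickness_scale) for a camera distance.

    density falls linearly from 1.0 at `near` to `min_density` at `far`.
    thickness_scale = 1 / density compensates for the removed strands.
    """
    if far <= near:
        raise ValueError("far must be greater than near")
    t = min(max((distance - near) / (far - near), 0.0), 1.0)
    density = 1.0 + t * (min_density - 1.0)
    return density, 1.0 / density

def strand_is_rendered(strand_index, density):
    """Keep an evenly spaced subset of strands for the given density."""
    return int((strand_index + 1) * density) > int(strand_index * density)
```

At density 0.5 every second strand is kept, each drawn twice as thick, so the overall volume of the hair changes little while the strand count is halved.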

Page 19: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 2.0 IMPROVEMENTS

TressFX11 Sample Code is much more modular

All of the necessary TressFX code is in separate files for

‒ Rendering

‒ Simulation

‒ Mesh management

‒ Asset loading

Code for head rendering and sample framework are completely separate

‒ Take the “TressFX” files to get just what you need

Better variable names

Removal of dead code

CODE RESTRUCTURING

[Diagram: code structure. Modules shown: Main; TressFXSimulate, TressFXRender, SceneRender; TressFXMesh; TressFXAssetLoader; Gaussian Filter; DX11Mesh; ObjImport. Legend: TressFX Code.]

Page 20: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 2.0 IMPROVEMENTS

Vertex shader optimizations for rendering

‒ Draw call for hair now uses an index buffer with a triangle list instead of looking up indices from a buffer

PPLL head buffer uses a RWTexture2D for better caching (tiled)

Hair shadow on model is softer and less blocky

Various shader code optimizations

Porting Guide

Download the new TressFX 2.0 sample soon from our Radeon SDK:

http://developer.amd.com

MISCELLANEOUS IMPROVEMENTS

Page 21: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 2.0 RENDERING

A-Buffer

‒ 2 UAVs

‒ Size determined by resolution

‒ Head of the Linked List UAV

‒ Screen resolution RWTexture2D, DXGI_FORMAT_R32_UINT

‒ Per-Pixel Linked List UAV

‒ Structured Buffer, size = (number of pixels) x (avg hair layers) x (sizeof(LinkedListStructure))

‒ Default average number of hair layers is 8

‒ Linked list structure is currently 3 DWORDs: depth, coverage, tangent

Limited memory, but unbounded linked list

‒ This means too many fragments for a given pixel can overflow the PPLL

‒ Can cause artifacts

‒ Typically this only happens if the camera gets too close

MEMORY CONSIDERATIONS

[Chart: Total A-Buffer Memory (MB), on an axis from 0 to 250, split into Linked List Head and Per-Pixel Linked List, for 720p and 1080p]
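The sizing rule above can be worked through numerically. The sketch takes the slide's figures at face value (4-byte head entries for DXGI_FORMAT_R32_UINT, 8 average layers, 3 DWORDs = 12 bytes per node); the function name and the 1280x720 and 1920x1080 resolutions are assumptions. If a node also carries a separate next link, the per-node size would be larger.

```python
def a_buffer_bytes(width, height, avg_layers=8, node_bytes=12, head_bytes=4):
    """Return (head_buffer_bytes, ppll_buffer_bytes) for a render target size."""
    pixels = width * height
    head = pixels * head_bytes                # one R32_UINT head pointer per pixel
    ppll = pixels * avg_layers * node_bytes   # fixed-size node pool
    return head, ppll

for name, (w, h) in {"720p": (1280, 720), "1080p": (1920, 1080)}.items():
    head, ppll = a_buffer_bytes(w, h)
    print(f"{name}: head {head / 2**20:.2f} MiB, PPLL {ppll / 2**20:.2f} MiB")
```

Under these assumptions the node pool comes to roughly 84 MiB at 720p and roughly 190 MiB at 1080p, while the head buffer stays under 8 MiB, which fits the 0 to 250 MB range of the chart.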

Page 22: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 2.0 RENDERING PERFORMANCE RESULTS

[Charts: Total Hair Render Time (ms), split into A-Buffer Pass and K-Buffer Pass, comparing TressFX 1.0 with TressFX 2.0 on the R9 290x and the R9 280x]

> 2X performance increase!

Page 23: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX SIMULATION

Page 24: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 1.0 Simulation Overview

‒ Main Interest

‒ Simulation Overview

‒ Constraints

‒ Global shape constraints

‒ Local Shape Constraints

‒ Edge length constraints

‒ Problems

TressFX Beyond

‒ General Constraint Formulation

‒ Tridiagonal Matrix-free Formulation

‒ Solving Linear System

‒ Benefits

TressFX Simulation Topics

Page 25: GS-4147, TressFX 2.0, by Bill-Bilodeau

Main Interest

Main interest of TressFX simulation

‒ Performance, performance and performance! – DirectCompute

‒ Styled hair – bending and twisting forces are important

‒ Stability – position based dynamics

‒ Conditions – wet, dry or heavy

‒ Wind – helps express dynamics even when the character is idle

Page 26: GS-4147, TressFX 2.0, by Bill-Bilodeau

Simulation Overview

CPU
    load hair data
    precompute rest-state values – can be offline

GPU – DirectCompute
    while simulation running do
        apply gravity
        integrate
        apply GSC (Global Shape Constraints)
        apply LSC (Local Shape Constraints)
        apply wind
        apply ELC (Edge Length Constraints)
        collision handling

GPU – Rendering pipeline (through the vertex buffer)

Page 27: GS-4147, TressFX 2.0, by Bill-Bilodeau

GLOBAL SHAPE CONSTRAINTS

GSC (Global Shape Constraints)

‒ The initial positions of particles serve as the global goal positions

‒ The goal positions are rigid with respect to the character's head transform.

‒ You can think of the initial positions as a cage in which the vertices are trapped during simulation.

‒ Easy and cheap. Helps maintain the global shape, but loses simulation detail

[Figure labels: initial goal position, current position, final position]
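A minimal sketch of the idea, assuming the constraint is applied as a simple pull toward the goal position. The stiffness parameter and the function name are hypothetical, and the goal is assumed to have already been moved by the head transform.

```python
def apply_global_shape_constraint(position, goal, stiffness):
    """Move a vertex part of the way toward its goal position.

    stiffness = 0 leaves the vertex where the simulation put it,
    stiffness = 1 snaps it back onto the goal position.
    """
    return tuple(p + stiffness * (g - p) for p, g in zip(position, goal))
```

With a stiffness of 0.5, a vertex at (0, 0, 0) with its goal at (2, 0, 0) ends up at (1, 0, 0), halfway between the current and the goal position.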

Page 28: GS-4147, TressFX 2.0, by Bill-Bilodeau

LOCAL SHAPE CONSTRAINTS

LSC (Local Shape Constraints)

‒ The goal positions are determined in the local frames.

‒ The goal positions are still transformed into the world frame before being applied to vertex positions.

[Figure labels: initial goal position, current position, final position]

Page 29: GS-4147, TressFX 2.0, by Bill-Bilodeau

LOCAL SHAPE CONSTRAINTS – CONT’

Local Transforms

‒ As in a robotic arm, an open-chain structure has joints, and each joint has a parent-child relationship with the joints it connects to.

‒ $T^{i-1}_{i}$ transforms (translates and rotates) child space (i) into its parent space (i-1)

‒ Chaining the local transforms gives the global (world) transform:

$$T^{w}_{i} = T^{w}_{0} \cdot T^{0}_{1} \cdots T^{i-2}_{i-1} \cdot T^{i-1}_{i}$$

‒ Local frames should be updated at each particle
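The chain of transforms can be accumulated from root to tip in one pass. The sketch below uses 2D rigid transforms stored as (angle, tx, ty) purely to keep it short; the real strands are 3D, and the representation and function names here are assumptions.

```python
import math

def compose(parent, child):
    """Return the product parent · child of two 2D rigid transforms.

    Each transform is (angle, tx, ty): rotate by angle, then translate.
    """
    pa, px, py = parent
    ca, cx, cy = child
    c, s = math.cos(pa), math.sin(pa)
    return (pa + ca, px + c * cx - s * cy, py + s * cx + c * cy)

def world_transforms(root_world, local_transforms):
    """Accumulate world transforms along one strand.

    root_world:       world transform of vertex 0.
    local_transforms: parent-to-child transforms for vertices 1, 2, ...
    Returns the world transform of every vertex, root first.
    """
    out = [root_world]
    for local in local_transforms:
        out.append(compose(out[-1], local))  # world_i = world_(i-1) · local_i
    return out
```

Each world transform reuses the previous one, which is why the update is serial along a strand but needs nothing from other strands.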

Page 30: GS-4147, TressFX 2.0, by Bill-Bilodeau

LOCAL SHAPE CONSTRAINTS – CONT’

Initialize and update local and global transforms

‒ Initialization is performed only once, on the CPU or offline.

‒ The update is performed every frame on the GPU.

‒ The update is serial along a strand but independent of other strands, so many strands are updated in parallel on the GPU.

‒ With local and global transforms, we can calculate target vertex positions for local shape constraints.

‒ Finally, update two neighboring vertices to get stable convergence.

[Figure labels: vertices i-1 and i; computing on local transform; updating position; zero]

Page 31: GS-4147, TressFX 2.0, by Bill-Bilodeau

EDGE LENGTH CONSTRAINTS

[Equation, reconstructed from its surviving annotations: position correction = 0.5 × (how much stretched or compressed) × (unit edge vector)]
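A minimal sketch of a distance constraint consistent with those annotations, assuming the standard position-based form in which both endpoints share the correction equally. The function name and the handling of a zero-length edge are assumptions; a pinned root vertex would need different weights.

```python
import math

def apply_edge_length_constraint(p0, p1, rest_length):
    """Move both endpoints so the edge returns to its rest length.

    Each endpoint takes half of the correction:
    0.5 * (current length - rest length) * unit edge vector.
    """
    edge = tuple(b - a for a, b in zip(p0, p1))
    length = math.sqrt(sum(c * c for c in edge))
    if length == 0.0:
        return p0, p1  # direction undefined; leave the edge untouched
    stretch = length - rest_length            # > 0 stretched, < 0 compressed
    unit = tuple(c / length for c in edge)    # unit edge vector from p0 to p1
    new_p0 = tuple(a + 0.5 * stretch * u for a, u in zip(p0, unit))
    new_p1 = tuple(b - 0.5 * stretch * u for b, u in zip(p1, unit))
    return new_p0, new_p1
```

A single edge is fixed exactly in one step, but neighboring edges share vertices and undo each other's corrections, which is why a strand normally needs several iterations.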

Page 32: GS-4147, TressFX 2.0, by Bill-Bilodeau

Problems

Extreme acceleration

‒ When the character makes a sudden move, it can generate extreme linear and angular accelerations that stretch the hair far beyond its rest length.

‒ Even with many Edge Length Constraint iterations, the hair doesn’t recover its original length and, as a result, can look too stretchy.

‒ One solution was to enforce Edge Length Constraints serially from the root to the tip of the hair with extra damping – used for Tomb Raider

‒ We need a better way! And we did research!

Page 33: GS-4147, TressFX 2.0, by Bill-Bilodeau

Problems EXTREME ACCELERATION

Page 34: GS-4147, TressFX 2.0, by Bill-Bilodeau

Future TressFX Simulation

Page 35: GS-4147, TressFX 2.0, by Bill-Bilodeau

General Constraint Formulation

Page 36: GS-4147, TressFX 2.0, by Bill-Bilodeau

Tridiagonal Matrix Formulation

Special Formulation for Chain Structure such as Hair

‒ We don’t want to solve a big matrix equation, especially on the GPU!

‒ Let’s take advantage of the linear topology and serial indexing

General case. We don’t want this!

Special case. Much simpler!

Known. Easy to compute.

Unknown: what we are solving for.

Page 37: GS-4147, TressFX 2.0, by Bill-Bilodeau

SOLVING LINEAR SYSTEM

Solving Linear System

‒ The formulation doesn’t require an explicit matrix – Good for GPU!

‒ Only the diagonal, superdiagonal and subdiagonal elements are non-zero – Sparse!

‒ The equation is diagonally dominant – a good fit for a direct solver!

‒ We can use the tridiagonal matrix algorithm (Thomas algorithm)

‒ So we can solve it on the GPU!
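For reference, a generic Thomas algorithm looks like the sketch below. It solves any tridiagonal system given as three diagonals, without building a matrix. The argument layout is an assumption, and the actual coefficients of the hair formulation are not given in this transcript.

```python
def solve_tridiagonal(sub, diag, sup, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    sub[i]  = A[i][i-1] (sub[0] is unused)
    diag[i] = A[i][i]
    sup[i]  = A[i][i+1] (sup[n-1] is unused)
    Assumes the system needs no pivoting, e.g. it is diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified superdiagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = sup[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward sweep: eliminate the subdiagonal.
    for i in range(1, n):
        denom = diag[i] - sub[i] * c[i - 1]
        c[i] = sup[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - sub[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

The cost is one forward and one backward pass over the strand, so it is linear in the number of vertices and fixed per frame, with no iteration count to tune.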

Page 38: GS-4147, TressFX 2.0, by Bill-Bilodeau

FUR CASTLE

Page 39: GS-4147, TressFX 2.0, by Bill-Bilodeau

FUR MUSHROOM

Page 40: GS-4147, TressFX 2.0, by Bill-Bilodeau

GRASS

Page 41: GS-4147, TressFX 2.0, by Bill-Bilodeau

BENEFITS

No more iterations for Edge Length Constraints

‒ No need to guess the number of iterations

‒ Fixed computation cost

‒ Fast convergence

Page 42: GS-4147, TressFX 2.0, by Bill-Bilodeau

TressFX 2.0

TressFX 2.0 renders hair faster than the previous version

‒ More than 2X faster in some cases

TressFX is now fast enough to use on consoles

More modular code structure means easier porting to your game

Realistic physics for hair simulation can now be extended to other objects

Stay tuned for more!

‒ Ongoing research to improve and expand the use of this technology

CONCLUSIONS

Page 43: GS-4147, TressFX 2.0, by Bill-Bilodeau

REFERENCE

Real-time Hair Simulation with Efficient Hair Style Preservation – Han, et al. VRIPHYS 2012

Tridiagonal Matrix Formulation for Inextensible Hair Strand Simulation – Han, et al. VRIPHYS 2013

Page 44: GS-4147, TressFX 2.0, by Bill-Bilodeau

DISCLAIMER & ATTRIBUTION

The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.

The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.

AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.

AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

ATTRIBUTION

© 2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo and combinations thereof are trademarks of Advanced Micro Devices, Inc. in the United States and/or other jurisdictions. SPEC is a registered trademark of the Standard Performance Evaluation Corporation (SPEC). Other names are for informational purposes only and may be trademarks of their respective owners.