Introduction to Direct3D 12 by Ivan Nevraev

Upload: amd-developer-central

Post on 16-Apr-2017

Page 1: Introduction to Direct 3D 12 by Ivan Nevraev

Ivan Nevraev, Microsoft

Introduction to Direct3D 12

Page 2: Introduction to Direct 3D 12 by Ivan Nevraev

Goals & Assumptions
• Preview of Direct3D 12
• More API details in future talks
• Assuming familiarity with Direct3D 11

Page 3: Introduction to Direct 3D 12 by Ivan Nevraev

Direct3D 12 API – Goals
• Console API efficiency and performance
• Reduce CPU overhead
• Increase scalability across multiple CPU cores
• Greater developer control
• Superset of D3D 11 rendering functionality

Page 4: Introduction to Direct 3D 12 by Ivan Nevraev

Render Context: Direct3D 11

[Diagram labels: ID3D11DeviceContext; pipeline stages Input Assembler, Vertex Shader, Hull Shader, Tessellator, Domain Shader, Geometry Shader, Rasterizer, Pixel Shader, Output Merger; GPU Memory; Other State]

Page 5: Introduction to Direct 3D 12 by Ivan Nevraev

CPU Overhead: Changing Pipeline State
• Direct3D 10 reduced number of state objects
• Still mismatched from hardware state
• Drivers resolve state at Draw

Page 6: Introduction to Direct 3D 12 by Ivan Nevraev

Direct3D 11 – Pipeline State Overhead
• Small state objects
• Hardware mismatch overhead

[Diagram labels: D3D Vertex Shader, D3D Rasterizer, D3D Pixel Shader and D3D Blend State on one side; HW State 1, HW State 2 and HW State 3 on the other]

Page 7: Introduction to Direct 3D 12 by Ivan Nevraev

Direct3D 12 – Pipeline State Optimization
• Group pipeline into single object
• Copy from PSO to Hardware State

[Diagram labels: Pipeline State Object on one side; HW State 1, HW State 2 and HW State 3 on the other]

Page 8: Introduction to Direct 3D 12 by Ivan Nevraev

Render Context: Direct3D 11

[Diagram labels: ID3D11DeviceContext; pipeline stages Input Assembler, Vertex Shader, Hull Shader, Tessellator, Domain Shader, Geometry Shader, Rasterizer, Pixel Shader, Output Merger; GPU Memory; Non-PSO State]

Page 9: Introduction to Direct 3D 12 by Ivan Nevraev

Render Context: Pipeline State Object (PSO)

[Diagram labels: Pipeline State Object; pipeline stages Input Assembler, Vertex Shader, Hull Shader, Tessellator, Domain Shader, Geometry Shader, Rasterizer, Pixel Shader, Output Merger; GPU Memory; Non-PSO State]

Page 10: Introduction to Direct 3D 12 by Ivan Nevraev

CPU Overhead: Resource Binding
• System needs to do lots of binding inspection
  • Resource hazards
  • Resource lifetime
  • Resource residency management
• Mirrored copies of state used to implement Get*
  • Ease of use for middleware

Page 11: Introduction to Direct 3D 12 by Ivan Nevraev

Resource Hazard Resolution
• Hazard tracking and resolution
  • Runtime
  • Driver
• Resource hazards
  • Render Target/Depth <> Texture
  • Tile Resource Aliasing
  • etc…

Page 12: Introduction to Direct 3D 12 by Ivan Nevraev

Direct3D 12 – Explicit Hazard Resolution

ResourceBarrier: generalization of Direct3D 11’s TiledResourceBarrier

D3D12_RESOURCE_BARRIER_DESC Desc;
Desc.Type = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
Desc.Transition.pResource = pRTTexture;
Desc.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
Desc.Transition.StateBefore = D3D12_RESOURCE_USAGE_RENDER_TARGET;
Desc.Transition.StateAfter = D3D12_RESOURCE_USAGE_PIXEL_SHADER_RESOURCE;
pContext->ResourceBarrier( 1, &Desc );
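The transition barrier on this slide names both the state a resource is leaving and the state it is entering, which implies the application keeps track of the current state itself. The following is a minimal standalone sketch of that bookkeeping, not Direct3D code: `Usage`, `Resource` and `Transition` are hypothetical names invented for illustration, and the real runtime's validation behavior is not described in the talk.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not Direct3D 12 types.
enum class Usage { RenderTarget, PixelShaderResource };

struct Resource {
    Usage current;  // the usage the application has tracked for this resource
};

// Models a transition barrier: it succeeds only if the declared "before"
// state matches the tracked state, and then records the "after" state.
inline bool Transition(Resource& r, Usage before, Usage after) {
    if (r.current != before) {
        return false;  // the application's tracking is out of sync
    }
    r.current = after;
    return true;
}
```

In this model, rendering to a texture and then sampling it requires exactly one successful `Transition` call in between; repeating the same call fails because the tracked state has already moved on.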

Page 13: Introduction to Direct 3D 12 by Ivan Nevraev

Resource Lifetime and Residency
• Explicit application control over resource lifetime
  • Resource destruction is immediate
  • Application must ensure no queued GPU work
  • Use Fence API to track GPU progress
  • One fence per frame is well amortized
• Explicit application control over resource residency
  • Application declares resources currently in use on GPU
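Because destruction is immediate, one common way to satisfy "no queued GPU work" is to park retired resources in a queue keyed by fence value and destroy them only once the GPU has passed that value. The sketch below is a plain C++ model of that idea under stated assumptions: `DeferredReleaseQueue` is a hypothetical helper, fence values are plain integers, and the actual Fence API calls are left out because the talk does not show them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper, not a Direct3D 12 type: holds destruction callbacks
// until the GPU has completed the fence value recorded at retire time.
class DeferredReleaseQueue {
public:
    // fenceValue is the value the application signals after the last GPU use.
    // Values are expected to be non-decreasing (one fence per frame).
    void Retire(std::uint64_t fenceValue, std::function<void()> destroy) {
        assert(pending_.empty() || pending_.back().first <= fenceValue);
        pending_.emplace_back(fenceValue, std::move(destroy));
    }

    // Call once per frame with the last fence value the GPU has completed.
    // Returns the number of resources destroyed by this call.
    int Collect(std::uint64_t completedValue) {
        int destroyed = 0;
        while (!pending_.empty() && pending_.front().first <= completedValue) {
            pending_.front().second();
            pending_.pop_front();
            ++destroyed;
        }
        return destroyed;
    }

    std::size_t PendingCount() const { return pending_.size(); }

private:
    std::deque<std::pair<std::uint64_t, std::function<void()>>> pending_;
};
```

With one fence per frame, `Collect` amounts to a short loop per frame, which fits the slide's remark that a per-frame fence is well amortized.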

Page 14: Introduction to Direct 3D 12 by Ivan Nevraev

Remove State Mirroring
• Application responsibility to communicate current state to middleware

Page 15: Introduction to Direct 3D 12 by Ivan Nevraev

Render Context: Pipeline State Object (PSO)

[Diagram labels: Pipeline State Object; pipeline stages Input Assembler, Vertex Shader, Hull Shader, Tessellator, Domain Shader, Geometry Shader, Rasterizer, Pixel Shader, Output Merger; GPU Memory; Non-PSO State]

Page 16: Introduction to Direct 3D 12 by Ivan Nevraev

Render Context: Remove State Reflection

[Diagram labels: Pipeline State Object; pipeline stages Input Assembler, Vertex Shader, Hull Shader, Tessellator, Domain Shader, Geometry Shader, Rasterizer, Pixel Shader, Output Merger; GPU Memory; Non-PSO State]

Page 17: Introduction to Direct 3D 12 by Ivan Nevraev

CPU Overhead: Redundant Resource Binding
• Streaming identical resource bindings frame over frame
• Partial changes require copying all bindings

Page 18: Introduction to Direct 3D 12 by Ivan Nevraev

Direct3D 12: Descriptor Heaps & Tables
• Scales across extremes of HW capability
• Unified approach serves breadth of app binding flows
  • Streaming changes to bindings
  • Reuse of static bindings
  • And everything between
• Dynamic indexing of shader resources

Page 19: Introduction to Direct 3D 12 by Ivan Nevraev

Descriptor
• Small chunk of data defining resource parameters
• Just opaque data – no OS lifetime management
• Hardware representation of Direct3D “View”

[Diagram: Descriptor { Type, Format, Mip Count, pData } pointing to a Texture]

Page 20: Introduction to Direct 3D 12 by Ivan Nevraev

Descriptor Heaps
• Storage for descriptors
• App owns the layout
• Low overhead to manipulate
• Multiple heaps allowed

[Diagram labels: GPU Memory; Descriptor Heap]

Page 21: Introduction to Direct 3D 12 by Ivan Nevraev

Descriptor Tables
• Context points to active heap
• A table is an index and a size in the heap
  • Not an API object
• Single view type per table
• Multiple tables per type

[Diagram labels: Pipeline State Object (…, Vertex Shader, Pixel Shader, …); table entries given as Start Index and Size]
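To make "a table is an index and a size in the heap" concrete, here is a small standalone C++ model. It is only a sketch: `Descriptor`, `DescriptorTable` and `DescriptorHeap` are hypothetical types written for this example, and the integer fields stand in for the opaque hardware data the slides describe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model types for illustration; not Direct3D 12 API objects.
struct Descriptor {
    int format;    // stand-in for the opaque view data
    int mipCount;
};

struct DescriptorTable {
    std::size_t start;  // index of the first descriptor in the heap
    std::size_t size;   // number of consecutive descriptors
};

class DescriptorHeap {
public:
    explicit DescriptorHeap(std::size_t capacity) : slots_(capacity) {}

    // The application owns the layout: it decides which slot holds what.
    void Write(std::size_t index, Descriptor d) {
        assert(index < slots_.size());
        slots_[index] = d;
    }

    // Resolves table-relative slot i to a descriptor stored in the heap.
    const Descriptor& At(const DescriptorTable& t, std::size_t i) const {
        assert(i < t.size && t.start + t.size <= slots_.size());
        return slots_[t.start + i];
    }

private:
    std::vector<Descriptor> slots_;
};
```

Since a table is just two integers, two tables can overlap and share a descriptor without copying it, and rebinding means changing a start index rather than rewriting the bindings.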

Page 22: Introduction to Direct 3D 12 by Ivan Nevraev

Render Context: Remove State Reflection

[Diagram labels: Pipeline State Object; pipeline stages Input Assembler, Vertex Shader, Hull Shader, Tessellator, Domain Shader, Geometry Shader, Rasterizer, Pixel Shader, Output Merger; GPU Memory; Non-PSO State]

Page 23: Introduction to Direct 3D 12 by Ivan Nevraev

Render Context: Descriptor Tables & Heaps

[Diagram labels: Pipeline State Object; pipeline stages Input Assembler, Vertex Shader, Hull Shader, Tessellator, Domain Shader, Geometry Shader, Rasterizer, Pixel Shader, Output Merger; GPU Memory; Non-PSO State]

Page 24: Introduction to Direct 3D 12 by Ivan Nevraev

Render Context: Direct3D 12

[Diagram labels: Pipeline State Object; pipeline stages Input Assembler, Vertex Shader, Hull Shader, Tessellator, Domain Shader, Geometry Shader, Rasterizer, Pixel Shader, Output Merger; GPU Memory; Non-PSO State]

Page 25: Introduction to Direct 3D 12 by Ivan Nevraev

CPU Overhead: Redundant Render Commands
• Typical applications send identical sequences of commands frame-over-frame
  • Measured 90-95% coherence on typical modern games

Page 26: Introduction to Direct 3D 12 by Ivan Nevraev

Bundles
• Small command list
  • Recorded once
  • Reused multiple times
• Free threaded creation
• Inherits from execute site
  • Non-PSO State
  • Descriptor Table Bindings
• Restrictions to ensure efficient driver implementation
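The record-once, replay-many behavior and the inheritance of state from the execute site can be modeled without any graphics API. The sketch below is a toy model only: `Context` and `Bundle` are hypothetical classes, commands are strings, and the single inherited binding (`constantTable`) stands in for the non-PSO state and descriptor table bindings listed on the slide.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model, not the Direct3D 12 API: commands are plain strings.
struct Context {
    std::string constantTable;     // binding set at the execute site
    std::vector<std::string> log;  // what the "GPU" would end up seeing
};

class Bundle {
public:
    void Draw(const std::string& name) {
        assert(!closed_);  // recording is only legal before Close()
        draws_.push_back(name);
    }

    void Close() { closed_ = true; }

    // Replays the recorded draws, inheriting the caller's table binding.
    void ExecuteOn(Context& ctx) const {
        assert(closed_);
        for (const std::string& d : draws_) {
            ctx.log.push_back(d + "@" + ctx.constantTable);
        }
    }

private:
    std::vector<std::string> draws_;
    bool closed_ = false;
};
```

Executing the same closed bundle twice with a different `constantTable` each time yields two sets of draws that differ only in the inherited binding, which is the pattern the later code slides use for objects #1 and #2.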

Page 27: Introduction to Direct 3D 12 by Ivan Nevraev

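Bundles

[Diagram: a Context command stream (Clear, Draw, SetTable, Execute Bundle, SetTable, Execute Bundle, SetPSO, …) referencing pre-recorded bundles whose contents are (SetPSO, Draw), (SetPSO, SetTable, Dispatch) and (SetPSO, SetTable, Draw, SetPSO, Draw)]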

Page 28: Introduction to Direct 3D 12 by Ivan Nevraev

Example code without Bundles

// Setup: pipeline state and common descriptor tables
pContext->SetPipelineState(pPSO);
pContext->SetRenderTargetViewTable(0, 1, FALSE, 0);
pContext->SetVertexBufferTable(0, 1);
pContext->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);

// Draw 1: set object #1 specific tables and draw
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->DrawInstanced(6, 1, 0, 0);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->DrawInstanced(6, 1, 6, 0);

// Draw 2: set object #2 specific tables and draw
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->DrawInstanced(6, 1, 0, 0);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->DrawInstanced(6, 1, 6, 0);

Page 29: Introduction to Direct 3D 12 by Ivan Nevraev

Bundles – Creating a Bundle

// Create bundle
pDevice->CreateCommandList(D3D12_COMMAND_LIST_TYPE_BUNDLE, pBundleAllocator, pPSO, pDescriptorHeap, &pBundle);

// Record commands
pBundle->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
pBundle->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pBundle->DrawInstanced(6, 1, 0, 0);
pBundle->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pBundle->DrawInstanced(6, 1, 6, 0);
pBundle->Close();

Page 30: Introduction to Direct 3D 12 by Ivan Nevraev

No Bundles

// Setup
pContext->SetPipelineState(pPSO);
pContext->SetRenderTargetViewTable(0, 1, FALSE, 0);
pContext->SetVertexBufferTable(0, 1);
pContext->IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);

// Draw 1
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->DrawInstanced(6, 1, 0, 0);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->DrawInstanced(6, 1, 6, 0);

// Draw 2
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->DrawInstanced(6, 1, 0, 0);
pContext->SetShaderResourceViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->DrawInstanced(6, 1, 6, 0);

Bundles

// Setup
pContext->SetRenderTargetViewTable(0, 1, FALSE, 0);
pContext->SetVertexBufferTable(0, 1);

// Draw 1 and 2
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 0, 1);
pContext->ExecuteBundle(pBundle);
pContext->SetConstantBufferViewTable(D3D12_SHADER_STAGE_PIXEL, 1, 1);
pContext->ExecuteBundle(pBundle);

Page 31: Introduction to Direct 3D 12 by Ivan Nevraev

Bundles: CPU performance improvements
• PC: 0.7 ms to 0.2 ms in a simple test (GPU bound)
• Xbox
  • 1/3 CPU consumption for rendering submission in one game
  • Hundreds of thousands of DrawBundle executions are possible per 60 FPS frame
• Even one draw per draw bundle helps
  • Saves engine overhead

Page 32: Introduction to Direct 3D 12 by Ivan Nevraev

Direct3D 12 – Command Creation Parallelism
• About that context…
• No Immediate Context
• All rendering via Command Lists
• Command Lists are submitted on a Command Queue

Page 33: Introduction to Direct 3D 12 by Ivan Nevraev

Command Lists and Command Queue
• Application responsible for
  • Hazard tracking
  • Declaring maximum number of recording command lists
  • Resource renaming with GPU-signaled fence
  • Lifetime of resources referenced by command lists
• Fence operations on the Command Queue
  • Not on Command List or Bundle
  • Signals occur on Command List completion
• Command List submission cost reduced by WDDM 2.0

Page 34: Introduction to Direct 3D 12 by Ivan Nevraev

Command Queue

[Diagram: Command Queue containing Execute Command List 1, Execute Command List 2, Signal Fence]
[Command List 1: Clear, SetTable, Execute Bundle A, SetTable, Draw, SetPSO, Draw]
[Command List 2: Clear, Dispatch, SetTable, Execute Bundle A, SetTable, Execute Bundle B]
[Bundle contents shown: (SetPSO, Draw), (SetPSO, SetTable, Dispatch), (SetPSO, SetTable, Draw, SetPSO, Draw)]

Page 36: Introduction to Direct 3D 12 by Ivan Nevraev

Dynamic Heaps
• Resource Renaming Overhead
  • Significant CPU overhead on ExecuteCommandList
  • Significant driver complexity
• Solution: Efficient Application Suballocation
  • Application creates large buffer resource and suballocates
  • Data type determined by application
  • Standardized alignment requirements
  • Persistently mapped memory
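Suballocating from one large buffer can be as simple as bumping an offset and rounding it up to the required alignment. The following standalone sketch shows that idea under assumptions of my own: `LinearSuballocator` is a hypothetical helper, it tracks offsets only (the mapped pointer and the buffer resource are omitted), and the alignment values a caller passes are illustrative rather than taken from the talk.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not a Direct3D 12 type: hands out aligned offsets
// into one large, persistently mapped buffer owned by the application.
class LinearSuballocator {
public:
    static constexpr std::size_t kInvalid = static_cast<std::size_t>(-1);

    explicit LinearSuballocator(std::size_t capacity) : capacity_(capacity) {}

    // Returns the offset of a block of `size` bytes aligned to `alignment`
    // (which must be a power of two), or kInvalid if the buffer is full.
    std::size_t Allocate(std::size_t size, std::size_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::size_t aligned = (head_ + alignment - 1) & ~(alignment - 1);
        if (aligned > capacity_ || size > capacity_ - aligned) {
            return kInvalid;
        }
        head_ = aligned + size;
        return aligned;
    }

    // Call only after a fence shows the GPU has finished reading the buffer.
    void Reset() { head_ = 0; }

private:
    std::size_t capacity_;
    std::size_t head_ = 0;
};
```

The application decides what each returned range holds (constants, indices, vertices), which matches "data type determined by application"; reuse is gated on a fence rather than on runtime-managed renaming.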

Page 37: Introduction to Direct 3D 12 by Ivan Nevraev

Allocation vs. Suballocation

[Diagram: two GPU Memory layouts, each showing Resource 1 and Resource 2 and the data types CB, IB, VB; one layout additionally shows a Heap and further entries (…)]

Page 38: Introduction to Direct 3D 12 by Ivan Nevraev

Direct3D 12 – CPU Parallelism
• Direct3D 12 has several parallel tasks
  • Command List Generation
  • Bundle Generation
  • PSO Creation
  • Resource Creation
  • Dynamic Data Generation
• Runtime and driver designed for parallelism
• Developer chooses what to make parallel

Page 39: Introduction to Direct 3D 12 by Ivan Nevraev

D3D11 Profiling

[Timeline chart: Threads 0 to 3 over 0 ms to 7.50 ms. Thread 0 shows App Logic, D3D11, UMD, DXGK, KMD and Present; Threads 1 to 3 show App Logic and D3D11. Legend: App Logic, D3D Runtime, User-mode Driver, DXGKernel, Kernel-mode Driver, Present]

Page 40: Introduction to Direct 3D 12 by Ivan Nevraev

D3D12 Profiling

[Timeline chart: Threads 0 to 3 over 0 ms to 7.50 ms. Every thread shows App Logic, UMD and D3D12; Thread 0 also shows DXGK/KMD and Present. Legend: App Logic, D3D Runtime, User-mode Driver, DXGKernel, Kernel-mode Driver, Present]

Page 41: Introduction to Direct 3D 12 by Ivan Nevraev

D3D11 v D3D12 numbers

[Timeline charts repeated from the two profiling slides, Threads 0 to 3 over 0 ms to 7.50 ms: D3D12 (App Logic, UMD and D3D12 on every thread; DXGK/KMD and Present on Thread 0) shown above D3D11 (App Logic and D3D11 on every thread; UMD, DXGK, KMD and Present on Thread 0)]

             App+GFX (ms)        GFX-only (ms)
             D3D11    D3D12      D3D11    D3D12
Thread 0      7.88     3.80       5.73     1.17
Thread 1      3.08     2.50       0.35     0.81
Thread 2      2.84     2.46       0.34     0.69
Thread 3      2.63     2.45       0.23     0.65
Total        16.42    11.21       6.65     3.32
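As a reading aid, the ratios implied by these figures can be worked out directly. These are derived from the numbers on the slide and are not stated in the talk itself:

```latex
\begin{aligned}
\text{Total App+GFX:}\quad & \frac{16.42}{11.21} \approx 1.46\times \\
\text{Total GFX-only:}\quad & \frac{6.65}{3.32} \approx 2.00\times \\
\text{Thread 0 App+GFX:}\quad & \frac{7.88}{3.80} \approx 2.07\times \\
\text{Thread 0 GFX-only:}\quad & \frac{5.73}{1.17} \approx 4.90\times
\end{aligned}
```

Threads 1 to 3 spend more time in graphics work under D3D12 (for example 0.35 ms to 0.81 ms on Thread 1), which is consistent with submission work moving off Thread 0 and onto the other cores.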

Page 42: Introduction to Direct 3D 12 by Ivan Nevraev

Summary
• Greater CPU Efficiency
• Greater CPU Scalability
• Greater Developer Control
• CPU Parallelism
• Resource Lifetime
• Memory Usage

Page 43: Introduction to Direct 3D 12 by Ivan Nevraev

The End