working with metal—overview - apple inc. · session 603 jeremy sandmel gpu software graphics and...

200
© 2014 Apple Inc. All rights reserved. Redistribution or public display not permitted without written permission from Apple. #WWDC14 Working with Metal—Overview Session 603 Jeremy Sandmel GPU Software Graphics and Games

Upload: others

Post on 12-Mar-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

© 2014 Apple Inc. All rights reserved. Redistribution or public display not permitted without written permission from Apple.

#WWDC14

Working with Metal—Overview

Session 603 Jeremy Sandmel GPU Software

Graphics and Games

Page 2: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal

Dramatically reduced overhead

Unified graphics and compute

Precompiled shaders

Efficient multithreading

Designed for A7

Page 3: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Agenda

Background

API concepts

Shading language

Developer tools

Page 4: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Agenda

Background

API concepts

Shading language

Developer tools

Page 5: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

10x more draw calls

Page 6: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

About Draw Calls…

Page 7: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

About Draw Calls…

Each draw call requires its own state vector • Shaders, states, textures, render targets, etc.

Page 8: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

About Draw Calls…

Each draw call requires its own state vector • Shaders, states, textures, render targets, etc.

Changing state vectors can be expensive • Translation to hardware commands

Page 9: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

About Draw Calls…

Each draw call requires its own state vector • Shaders, states, textures, render targets, etc.

Changing state vectors can be expensive • Translation to hardware commands

for the CPU

Page 10: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

About Draw Calls…

Each draw call requires its own state vector • Shaders, states, textures, render targets, etc.

Changing state vectors can be expensive • Translation to hardware commands

Set Shaders

Set Textures

Set Vertex Buffers

Draw #1

Set Shaders

Set Blend

Set Depth Test

Draw #2

Your Application

for the CPU

Page 11: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

About Draw Calls…

Each draw call requires its own state vector • Shaders, states, textures, render targets, etc.

Changing state vectors can be expensive • Translation to hardware commands

Set Shaders

Set Textures

Set Vertex Buffers

Draw #1

Set Shaders

Set Blend

Set Depth Test

Draw #2

Your Application

Hardware commands

for the CPU

Page 12: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

About Draw Calls…

Each draw call requires its own state vector • Shaders, states, textures, render targets, etc.

Changing state vectors can be expensive • Translation to hardware commands

Set Shaders

Set Textures

Set Vertex Buffers

Draw #1

Set Shaders

Set Blend

Set Depth Test

Draw #2

Your Application

Hardware commands

for the CPU

Page 13: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

About Draw Calls…

Each draw call requires its own state vector • Shaders, states, textures, render targets, etc.

Changing state vectors can be expensive • Translation to hardware commands

More draw calls per frame gives you • More unique objects

• More visual variety

• More freedom for game artists and designers

Set Shaders

Set Textures

Set Vertex Buffers

Draw #1

Set Shaders

Set Blend

Set Depth Test

Draw #2

Your Application

Hardware commands

for the CPU

Page 14: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Before Metal

Page 15: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Before Metal

Long history of GPU programming APIs • Standards—OpenGL, OpenCL

• Domains—High level, low level, 2D, 3D

• Architectures—Platforms, devices, GPUs

Page 16: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Before Metal

Long history of GPU programming APIs • Standards—OpenGL, OpenCL

• Domains—High level, low level, 2D, 3D

• Architectures—Platforms, devices, GPUs

Something was missing…

Page 17: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Deep Integration

Page 18: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Deep Integration

+ +

Page 19: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Deep Integration

+ +

Page 20: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Deep Integration

What if we took the same approach for GPU programming?

+ +

Page 21: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Deep Integration

What if we took the same approach for GPU programming?

+ +

Page 22: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Deep Integration

What if we took the same approach for GPU programming?

=+ +

Page 23: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled
Page 24: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Design

Page 25: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Design

Thinnest possible API

Page 26: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Design

Thinnest possible API

Modern GPU features

Page 27: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Design

Thinnest possible API

Modern GPU features

Do expensive tasks less often

Page 28: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Design

Thinnest possible API

Modern GPU features

Do expensive tasks less often

Predictable performance

Page 29: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Design

Thinnest possible API

Modern GPU features

Explicit command submission

Do expensive tasks less often

Predictable performance

Page 30: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Design

Thinnest possible API

Modern GPU features

Explicit command submission

Do expensive tasks less often

Optimized for CPU behavior

Predictable performance

Page 31: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your App

GPU

Page 32: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your App

GPU

SceneKit SpriteKit

Scene Graphs

Page 33: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your App

GPU

SceneKit SpriteKit

Scene Graphs

Core Animation Core Image

Core Graphics

2D Graphics and Imaging

Page 34: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your App

GPU

SceneKit SpriteKit

Scene Graphs

Core Animation Core Image

Core Graphics

2D Graphics and Imaging

OpenGL ES

Standards-Based 3D Graphics

Page 35: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your App

GPU

SceneKit SpriteKit

Scene Graphs

Core Animation Core Image

Core Graphics

2D Graphics and Imaging

OpenGL ES

Standards-Based 3D Graphics

High Efficiency GPU Access

Metal

Page 36: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

So how did we do this?

Page 37: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Frame Times

Many games target frame rate of 30 FPS (33.3 milliseconds/frame)

Page 38: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Frame Times

Many games target frame rate of 30 FPS (33.3 milliseconds/frame)

0 ms 33.3 ms 66.7 ms 100 ms

Page 39: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Frame Times

Many games target frame rate of 30 FPS (33.3 milliseconds/frame)

CPU workframe N

GPU work frame N

GPU workframe N-1

0 ms 33.3 ms 66.7 ms 100 ms

CPU work frame N-1

Page 40: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Frame Times

Many games target frame rate of 30 FPS (33.3 milliseconds/frame)

CPU workframe N

GPU work frame N

CPU work frame N+1

GPU work frame N+1

CPU work frame N+2

GPU workframe N-1

0 ms 33.3 ms 66.7 ms 100 ms

GPU work frame N+2

CPU work frame N-1

Page 41: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

One “Balanced” Frame

CPU

GPU

CPU workframe N

GPU workframe N-1

0 ms 33.3 ms

Page 42: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

CPU Can Take More Time Than GPU

CPU

GPU

CPU workframe N

GPU workframe N-1

0 ms 33.3 ms

Page 43: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

CPU Can Take More Time Than GPU

CPU

GPU

CPU workframe N

GPU workframe N-1

0 ms 33.3 ms

GPU idle time GPU utilization = 67%

Page 44: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

More GPU Work Requires More CPU Work

CPU

GPU

CPU workframe N

GPU workframe N-1

0 ms 33.3 ms

GPU idle time

50 ms

Page 45: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

33.3 ms

CPU Time Includes Application and GPU API

CPU

GPU

CPU workframe N

GPU workframe N-1

0 ms

GPU idle time

GPU API frame N

50 ms

Application frame N

Page 46: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

33.3 ms

Targeting CPU Time Spent in GPU API

CPU

GPU GPU workframe N-1

0 ms

GPU idle time

GPU API frame N

50 ms

Application frame N

Page 47: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Dramatically Reduces GPU API Time

CPU

GPU GPU workframe N-1

0 msGPU API frame N

33.3 ms

CPU idle time

Application frame N

20 ms

Page 48: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Dramatically Reduces GPU API Time

CPU

GPU GPU workframe N-1

0 msGPU API frame N

33.3 ms

CPU idle time

Application frame N

20 ms

Page 49: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Use CPU Time to Improve Your Game

CPU

GPU GPU workframe N-1

0 ms

More PhysicsMore AI

33.3 ms

CPU idle time

Application frame N

20 ms

Page 50: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Use CPU Time to Draw More Objects

CPU

GPU GPU workframe N-1

0 ms

Up to 10x more draw calls

33.3 ms

Application frame N

More PhysicsMore AI

Page 51: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Use CPU Time to Draw More Objects

CPU

GPU GPU workframe N-1

0 ms

Up to 10x more draw calls

33.3 ms

Application frame N

More PhysicsMore AI

Page 52: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Why Is GPU Programming Expensive?

Page 53: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Why Is GPU Programming Expensive?

State validation • Confirming API usage is valid

• Encoding API state to hardware state

Page 54: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Why Is GPU Programming Expensive?

State validation • Confirming API usage is valid

• Encoding API state to hardware state

Shader compilation • Run-time generation of shader machine code

• Interactions between state and shaders

Page 55: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Why Is GPU Programming Expensive?

State validation • Confirming API usage is valid

• Encoding API state to hardware state

Shader compilation • Run-time generation of shader machine code

• Interactions between state and shaders

Sending work to GPU • Managing resource residency

• Batching commands

Page 56: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Do Expensive Tasks Less Often

When Frequency

Application build

Content loading

Draw time

Page 57: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Do Expensive Tasks Less Often

When Frequency

Application build

Content loading

Draw time

“Never”

Page 58: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Do Expensive Tasks Less Often

When Frequency

Application build

Content loading

Draw time

“Never”

Rare

Page 59: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Do Expensive Tasks Less Often

When Frequency

Application build

Content loading

Draw time

“Never”

Rare

1000s of times per frame

Page 60: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Do Expensive Tasks Less Often

When Frequency Before Metal

Application build

Content loading

Draw time

“Never”

Rare

1000s of times per frame

Shader compilation

State validation

Start work on GPU

Page 61: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Do Expensive Tasks Less Often

When Frequency Before Metal

Application build

Content loading

Draw time

“Never”

Rare

1000s of times per frame

Shader compilation

State validation

Start work on GPU

After Metal

Shader compilation

State validation

Start work on GPU

Page 62: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Agenda

Background

API concepts

Shading language

Developer tools

Page 63: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Objects

Objects Purpose

Page 64: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Objects

Objects Purpose

Device The GPU

Page 65: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Objects

Objects Purpose

Device The GPU

Command Queue Serial sequence of command buffers

Page 66: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Objects

Objects Purpose

Device The GPU

Command Queue Serial sequence of command buffers

Command Buffer Contains GPU hardware commands

Page 67: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Objects

Objects Purpose

Device The GPU

Command Queue Serial sequence of command buffers

Command Buffer Contains GPU hardware commands

Command Encoder Translates API commands to GPU hardware commands

Page 68: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Objects

Objects Purpose

Device The GPU

Command Queue Serial sequence of command buffers

Command Buffer Contains GPU hardware commands

Command Encoder Translates API commands to GPU hardware commands

State Framebuffer configuration, blend, depth, samplers, etc.

Page 69: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Objects

Objects Purpose

Device The GPU

Command Queue Serial sequence of command buffers

Command Buffer Contains GPU hardware commands

Command Encoder Translates API commands to GPU hardware commands

State Framebuffer configuration, blend, depth, samplers, etc.

Code Shaders

Page 70: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Objects

Objects Purpose

Device The GPU

Command Queue Serial sequence of command buffers

Command Buffer Contains GPU hardware commands

Command Encoder Translates API commands to GPU hardware commands

State Framebuffer configuration, blend, depth, samplers, etc.

Code Shaders

Resources Textures and Data Buffers (vertices, constants, etc.)

Page 71: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Device

State

Descriptor

System

Code

Resource

Key

Page 72: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

DeviceCommand Queue

State

Descriptor

System

Code

Resource

Key

Page 73: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Buffer

Command Buffer

DeviceCommand Queue

Command Buffer

State

Descriptor

System

Code

Resource

Key

Page 74: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Render Command Encoder

Command Buffer

Command Buffer

Device

Render Command Encoder

Command Queue

Command Buffer

State

Descriptor

System

Code

Resource

Key

Page 75: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Render Command Encoder

Command Buffer

Command Buffer

Data Buffer Resource

Data Buffer Resource

Device

Render Command Encoder

Data Buffer Resource

Command Queue

Command Buffer

State

Descriptor

System

Code

Resource

Key

Page 76: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Render Command Encoder

Command Buffer

Command Buffer

Data Buffer Resource

Data Buffer Resource

Texture Resource

Texture Resource

Texture Resource Texture

DescriptorTexture

DescriptorTexture

Descriptor

Device

Render Command Encoder

Data Buffer Resource

Command Queue

Command Buffer

State

Descriptor

System

Code

Resource

Key

Page 77: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Render Command Encoder

Command Buffer

Command Buffer

Data Buffer Resource

Data Buffer Resource

Texture Resource

Texture Resource

Texture Resource Texture

DescriptorTexture

DescriptorTexture

Descriptor

Device

Render Command Encoder

Render Pipeline State

Render Pipeline Descriptor

Vertex Descriptor

Function

Function

Blend Descriptor

Data Buffer Resource

Vertex Format

Vertex Function

Fragment Function

Blend

Command Queue

Command Buffer

State

Descriptor

System

Code

Resource

Key

Page 78: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Render Command Encoder

Command Buffer

Command Buffer

Data Buffer Resource

Data Buffer Resource

Texture Resource

Texture Resource

Texture Resource Texture

DescriptorTexture

DescriptorTexture

Descriptor

Device

Render Command Encoder

Render Pipeline State

Render Pipeline Descriptor

Vertex Descriptor

Function

Function

Blend Descriptor

Depth Stencil State

Depth Stencil Descriptor

Data Buffer Resource

Vertex Format

Vertex Function

Fragment Function

Blend

Command Queue

Command Buffer

State

Descriptor

System

Code

Resource

Key

Page 79: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Render Command Encoder

Command Buffer

Command Buffer

Texture DescriptorTexture

Descriptor

Data Buffer Resource

Data Buffer Resource

Texture Resource

Texture Resource

Texture Resource Texture

DescriptorTexture

DescriptorTexture

Descriptor

Texture DescriptorTexture

Resource

Texture Resource

Attachment Descriptor

Attachment Descriptor

Device

Render Command Encoder

Render Pipeline State

Render Pipeline Descriptor

Vertex Descriptor

Function

Function

Blend Descriptor

Depth Stencil State

Depth Stencil Descriptor

Data Buffer Resource

Render Pass Descriptor

Attachment Descriptor

Texture Resource

Vertex Format

Vertex Function

Fragment Function

Blend

Command Queue

Command Buffer

State

Descriptor

System

Code

Resource

Key

Page 80: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Render Command Encoder

Command Buffer

Command Buffer

Data Buffer Resource

Data Buffer Resource

Texture Resource

Texture Resource

Texture Resource

Device

Render Command Encoder

Render Pipeline State

Depth Stencil State

Data Buffer Resource

Command Queue

Command Buffer

Texture Resource

Texture Resource

Texture Resource

Page 81: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Render Command Encoder

Command Buffer

Command Buffer

Data Buffer Resource

Data Buffer Resource

Texture Resource

Texture Resource

Texture Resource

Device

Render Command Encoder

Render Pipeline State

Depth Stencil State

Data Buffer Resource

Command Queue

Command Buffer

Texture Resource

Texture Resource

Texture Resource

Changeable Source Textures

Changeable Source Buffers

Page 82: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Render Command Encoder

Command Buffer

Command Buffer

Data Buffer Resource

Data Buffer Resource

Texture Resource

Texture Resource

Texture Resource

Device

Render Command Encoder

Render Pipeline State

Depth Stencil State

Data Buffer Resource

Command Queue

Command Buffer

Texture Resource

Texture Resource

Texture Resource

Changeable Source Textures

Fixed Render Target Textures

Changeable Source Buffers

Page 83: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Render Command Encoder

Device

Render Command Encoder

Command Queue

Command Buffer

Page 84: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Compute Command Encoder

Device

Render Command Encoder

Command Queue

Command Buffer

Page 85: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Compute Command Encoder

Device

Render Command Encoder

Command Queue

Command Buffer

Thread #1

Render Command Encoder

Compute Command Encoder

Blit Command Encoder

Command Buffer

Thread #2

Page 86: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Submission Model

Command encoders convert API commands into hardware commands

Page 87: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Submission Model

Command encoders convert API commands into hardware commands

Hardware commands stored in command buffers

Page 88: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Submission Model

Command encoders convert API commands into hardware commands

Hardware commands stored in command buffers

Three types of command encoders • Render, Compute, Blit

• Can interleave different types into single command buffer

• Avoids implicit expensive state save and restore operations

Page 89: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Submission Model

Page 90: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Submission Model

Explicit command buffer construction and submission • App creates many lightweight command buffers

• App controls command buffer submission

• Metal signals app when command buffers finish execution

Page 91: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Submission Model

Explicit command buffer construction and submission • App creates many lightweight command buffers

• App controls command buffer submission

• Metal signals app when command buffers finish execution

Command encoders generate commands immediately • No deferred state validation

• Direct call to driver

Page 92: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Submission Model

Multithreaded command encoding • Multiple command buffers can be encoded in parallel

• App decides execution order

• Very efficient implementation to ensure scalable performance

Page 93: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Page 94: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Designed for A7’s unified memory system • CPU and GPU share same storage

• No implicit copies

• Automatic CPU/GPU coherency model

- CPU and GPU observe writes at command buffer execution boundaries

- No explicit CPU cache management required

Page 95: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Designed for A7’s unified memory system • CPU and GPU share same storage

• No implicit copies

• Automatic CPU/GPU coherency model

- CPU and GPU observe writes at command buffer execution boundaries

- No explicit CPU cache management required

Significantly higher performance • More synchronization responsibilities for you

Page 96: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Page 97: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Two types of resources • Textures (formatted images)

- Render targets, texture sources

• Data buffers (unformatted memory)

- “a bag of bytes”

- Vertex data, shader constants, output memory, etc.

Page 98: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Two types of resources • Textures (formatted images)

- Render targets, texture sources

• Data buffers (unformatted memory)

- “a bag of bytes”

- Vertex data, shader constants, output memory, etc.

Resource structure (size, levels, format) can’t be changed • Avoids costly resource validation

• Resource contents can be changed

Page 99: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Page 100: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Updating data buffers • Direct access to storage by CPU

• No “lock” API needed

Page 101: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Updating data buffers • Direct access to storage by CPU

• No “lock” API needed

Updating textures • Implementation private storage

• Metal provides blazing fast texture update routines

Page 102: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Updating data buffers • Direct access to storage by CPU

• No “lock” API needed

Updating textures • Implementation private storage

• Metal provides blazing fast texture update routines

GPU-accelerated and pipelined updates • Via Blit command encoder

Page 103: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Page 104: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Can share texture storage with other textures • Interpret as different pixel formats with same pixel size

- eg., sRGB vs. RGB, or single 32-bit component vs. RGBA8888

Page 105: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource Update Model

Can share texture storage with other textures • Interpret as different pixel formats with same pixel size

- eg., sRGB vs. RGB, or single 32-bit component vs. RGBA8888

Can share texture storage with other buffers • Assumes “row-linear” pixel data

Page 106: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Encoder Types

Page 107: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Encoder Types

Render command encoder • Graphics rendering

Page 108: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Encoder Types

Render command encoder • Graphics rendering

Compute command encoder • Data parallel computations

Page 109: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Command Encoder Types

Render command encoder • Graphics rendering

Compute command encoder • Data parallel computations

Blit command encoder • GPU-accelerated resource copy operations

Page 110: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Page 111: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Generates hardware commands for single rendering “pass” • All rendering to single framebuffer object

• Specifies states for vertex and fragment stages of 3D pipeline

• Interleaves resources, state changes, and draw calls

Page 112: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render Command Encoder

Generates hardware commands for single rendering “pass” • All rendering to single framebuffer object

• Specifies states for vertex and fragment stages of 3D pipeline

• Interleaves resources, state changes, and draw calls

No “draw time” compilation • App controls when all significant compilation and state validation occurs

Page 113: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render State Objects

State Object Description

DepthStencil DepthStencil comparison functions and write masks

Sampler Filter states, addressing modes, LOD state

Render Pipeline

“Everything else” Vertex and pixel shader functions

Vertex data layout Multisample state

Blend state Color write masks

Page 114: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Render States

States affecting compilation can’t be changed after object creation

Inexpensive states can be changed

Changeable Render States

Vertex and fragment shaders # of render targets, pixel format, color write masks

Multisample state Blend state

DepthStencil state and write masks

Specification of buffers, textures, samplers Cull mode and facing orientation

Depth clipping and depth bias Polygon mode

Viewport and scissor Occlusion queries

Page 115: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Framebuffer Loads and Stores

Framebuffer configuration designed for optimal behavior on A7 GPU • Tile-based deferred-mode renderer

Page 116: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Framebuffer Loads and Stores

Framebuffer configuration designed for optimal behavior on A7 GPU • Tile-based deferred-mode renderer

Explicit control of framebuffer tile cache “Load and Store” operations • Load at start of render pass

• Store at end of render pass

Page 117: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Framebuffer Loads and Stores

Framebuffer configuration designed for optimal behavior on A7 GPU • Tile-based deferred-mode renderer

Explicit control of framebuffer tile cache “Load and Store” operations • Load at start of render pass

• Store at end of render pass

App choses per render target • Load—Don’t care, load, clear

• Store—Don’t care, store, multisample resolve

Page 118: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Before MetalFramebuffer loads and stores

One Frame

Read Write

Render Pass #2Render Pass #1Color

FramebufferColor

FramebufferColor

Framebuffer

Read Write

Page 119: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Before MetalFramebuffer loads and stores

One Frame

Read Write

Render Pass #2Render Pass #1

Read WriteColor

FramebufferColor

FramebufferColor

Framebuffer

Page 120: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Before MetalFramebuffer loads and stores

One Frame

Read Write

Render Pass #2Render Pass #1

Read Write

Depth Framebuffer

Depth Framebuffer

Depth Framebuffer

Read Write Read Write

Color Framebuffer

Color Framebuffer

Color Framebuffer

Page 121: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Before MetalFramebuffer loads and stores

One Frame

Read Write

Render Pass #2Render Pass #1

Read Write

Depth Framebuffer

Depth Framebuffer

Depth Framebuffer

Read Write Read Write

Color: 2 reads + 2 writesDepth: 2 reads + 2 writes

Framebuffer Memory Bandwidth

Color Framebuffer

Color Framebuffer

Color Framebuffer

Page 122: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Read WriteColor

Framebuffer

MetalFramebuffer loads and stores

One Frame

Read

Render Pass #2Render Pass #1

Depth Framebuffer

Depth Framebuffer

Depth Framebuffer

Read Write Read Write

Color Framebuffer

Color Framebuffer

Write

Page 123: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Read WriteColor

Framebuffer

MetalFramebuffer loads and stores

One Frame

Render Pass #2Render Pass #1

Depth Framebuffer

Depth Framebuffer

Depth Framebuffer

Read Write Read Write

Don’t Care Store

Color Framebuffer

Color Framebuffer

Write

Page 124: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Read WriteColor

Framebuffer

MetalFramebuffer loads and stores

One Frame

Render Pass #2Render Pass #1

Depth Framebuffer

Depth Framebuffer

Depth Framebuffer

Read Write Read Write

Don’t Care Store Load Store

Color Framebuffer

Color Framebuffer

Write

Page 125: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Read WriteColor

Framebuffer

MetalFramebuffer loads and stores

One Frame

Render Pass #2Render Pass #1

Depth Framebuffer

Depth Framebuffer

Depth Framebuffer

Read Write

Don’t Care Store Load Store

Clear Don’t Care

Color Framebuffer

Color Framebuffer

Write

Page 126: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Read WriteColor

Framebuffer

MetalFramebuffer loads and stores

One Frame

Render Pass #2Render Pass #1

Depth Framebuffer

Depth Framebuffer

Depth Framebuffer

Don’t Care Store Load Store

Clear Don’t Care Clear Don’t

Care

Color Framebuffer

Color Framebuffer

Write

Page 127: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

ReadWrite

Dramatic Reduction in Memory Traffic

One Frame

Render Pass #2Render Pass #1

WriteDon’t Care Store Load Store

Clear Don’t Care Clear Don’t

Care

Color: 1 reads + 2 writesDepth: 0 reads + 0 writes

Framebuffer Memory Bandwidth

Color Framebuffer

Depth Framebuffer

Color Framebuffer

Color Framebuffer

Depth Framebuffer

Depth Framebuffer

Page 128: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Compute Command Encoder

Familiar compute run-time and memory model • Textures and data buffers

• Local and global memory

• Local atomics

• Barriers

• Memory loads and stores

• User-specified workgroup dimensions

Page 129: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Compute Command Encoder

Page 130: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Compute Command Encoder

Fully integrated with graphics • Unified API, shading language and developer tools

• Efficiently interleaves Compute commands with Render and Blit commands

Page 131: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Compute Command Encoder

Fully integrated with graphics • Unified API, shading language and developer tools

• Efficiently interleaves Compute commands with Render and Blit commands

No “execution-time” compilation • App controls when all significant compilation and validation occurs

Page 132: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Compute Command Encoder

State Object Description

Compute State Compute functions, workgroup configuration

Sampler Filter states, addressing modes, LOD state

Page 133: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Blit Command Encoder

Enables asynchronous copies • In parallel with graphics and compute operations

Page 134: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Blit Command Encoder

Enables asynchronous copies • In parallel with graphics and compute operations

Texture uploads • Copy to/from other texture or data buffer

• Accelerated mipmap generation

Page 135: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Blit Command Encoder

Enables asynchronous copies • In parallel with graphics and compute operations

Texture uploads • Copy to/from other texture or data buffer

• Accelerated mipmap generation

Data Buffer updates • Copy to/from another data buffer or texture

• Fill with constant values

Page 136: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Agenda

Background

API concepts

Shading language

Developer tools

Page 137: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shading Language

Page 138: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shading Language

Unified shading language for graphics and compute processing

Page 139: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shading Language

Unified shading language for graphics and compute processing

Based on C++11 • Static subset

• Built from LLVM and clang

Page 140: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shading Language

Unified shading language for graphics and compute processing

Based on C++11 • Static subset

• Built from LLVM and clang

Additions • GPU hardware features (texture sampling, rasterization, compute operations, etc.)

• Function overloading and templates

Page 141: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shading Language

Page 142: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shading Language

Data types for graphics and compute features • Scalar, vector and matrix types

• Samplers and textures

Page 143: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shading Language

Data types for graphics and compute features • Scalar, vector and matrix types

• Samplers and textures

“Attributes” • Function arguments

• Sampling and interpolation qualifiers

• Per-instance inputs, outputs, and built-in graphics variables

• Programmable blending

Page 144: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shading Language

Page 145: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shading Language

Multiple shaders per source file

Page 146: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shading Language

Multiple shaders per source file

Metal shaders built by Xcode compiler into Metal library files • Library contains archive of Metal shaders

• With run-time APIs

- Load a Metal library

- Finalize compilation to GPU machine code

Page 147: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shading Language

Multiple shaders per source file

Metal shaders built by Xcode compiler into Metal library files • Library contains archive of Metal shaders

• With run-time APIs

- Load a Metal library

- Finalize compilation to GPU machine code

Metal includes standard library for graphics and compute functions

Page 148: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Argument Tables

Textures, buffers, and samplers passed as arguments to functions • Vertex, fragment, compute shaders

Page 149: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Argument Tables

Textures, buffers, and samplers passed as arguments to functions • Vertex, fragment, compute shaders

Each command encoder includes set of “argument tables” • One table per type (texture, buffer, sampler)

Page 150: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Argument Tables

Textures, buffers, and samplers passed as arguments to functions • Vertex, fragment, compute shaders

Each command encoder includes set of “argument tables” • One table per type (texture, buffer, sampler)

Metal API and shading language use table index to reference arguments

Page 151: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Argument Tables

Command Encoder

Texture Index ID

0 Texture A

1 Texture B

.. Texture ..

n Texture Z

Page 152: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Argument Tables

Command Encoder

Texture Index ID

0 Texture A

1 Texture B

.. Texture ..

n Texture Z

Buffer Index ID

0 Buffer A

1 Buffer B

.. Buffer ..

n Buffer Z

Sampler Index ID

0 Sampler A

1 Sampler B

.. Sampler ..

n Sampler Z

Page 153: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Argument Tables

Command EncoderShader Code

Texture Index ID

0 Texture A

1 Texture B

.. Texture ..

n Texture Z

Buffer Index ID

0 Buffer A

1 Buffer B

.. Buffer ..

n Buffer Z

Sampler Index ID

0 Sampler A

1 Sampler B

.. Sampler ..

n Sampler Z

vertex VertexOutput smoothTriangleVertex(constant float4 *pos_data [[ buffer(0) ]], constant float2 *uv_data [[ buffer(1) ]], uint vid [[ vertex_id ]]) { VertexOutput out; out.pos = pos_data[vid]; out.uv = uv_data[vid]; return out; } !

fragment float4 smoothTriangleFragment(VertexOutput in [[ stage_in ]], texture2d<float> tex [[ texture(1) ]]) { return tex.sample(s, in.uv); }

Page 154: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Argument Tables

Command EncoderShader Code

Texture Index ID

0 Texture A

1 Texture B

.. Texture ..

n Texture Z

Buffer Index ID

0 Buffer A

1 Buffer B

.. Buffer ..

n Buffer Z

Sampler Index ID

0 Sampler A

1 Sampler B

.. Sampler ..

n Sampler Z

vertex VertexOutput smoothTriangleVertex(constant float4 *pos_data [[ buffer(0) ]], constant float2 *uv_data [[ buffer(1) ]], uint vid [[ vertex_id ]]) { VertexOutput out; out.pos = pos_data[vid]; out.uv = uv_data[vid]; return out; } !

fragment float4 smoothTriangleFragment(VertexOutput in [[ stage_in ]], texture2d<float> tex [[ texture(1) ]]) { return tex.sample(s, in.uv); }

Page 155: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Argument Tables

Command EncoderShader Code

Texture Index ID

0 Texture A

1 Texture B

.. Texture ..

n Texture Z

Buffer Index ID

0 Buffer A

1 Buffer B

.. Buffer ..

n Buffer Z

Sampler Index ID

0 Sampler A

1 Sampler B

.. Sampler ..

n Sampler Z

vertex VertexOutput smoothTriangleVertex(constant float4 *pos_data [[ buffer(0) ]], constant float2 *uv_data [[ buffer(1) ]], uint vid [[ vertex_id ]]) { VertexOutput out; out.pos = pos_data[vid]; out.uv = uv_data[vid]; return out; } !

fragment float4 smoothTriangleFragment(VertexOutput in [[ stage_in ]], texture2d<float> tex [[ texture(1) ]]) { return tex.sample(s, in.uv); }

Page 156: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Argument Tables

Command EncoderShader Code

Texture Index ID

0 Texture A

1 Texture B

.. Texture ..

n Texture Z

Buffer Index ID

0 Buffer A

1 Buffer B

.. Buffer ..

n Buffer Z

Sampler Index ID

0 Sampler A

1 Sampler B

.. Sampler ..

n Sampler Z

vertex VertexOutput smoothTriangleVertex(constant float4 *pos_data [[ buffer(0) ]], constant float2 *uv_data [[ buffer(1) ]], uint vid [[ vertex_id ]]) { VertexOutput out; out.pos = pos_data[vid]; out.uv = uv_data[vid]; return out; } !

fragment float4 smoothTriangleFragment(VertexOutput in [[ stage_in ]], texture2d<float> tex [[ texture(1) ]]) { return tex.sample(s, in.uv); }

Page 157: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Agenda

Background

API concepts

Shading language

Developer tools

Page 158: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Shader Compiler Process

Page 159: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Shader Compiler Process

Metal shader sources compiled to libraries at application build time • No need to ship source code with application

• Shading language errors reported at build time

Page 160: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Shader Compiler Process

Metal shader sources compiled to libraries at application build time • No need to ship source code with application

• Shading language errors reported at build time

Metal libraries compiled to device code at state object creation time • No draw time compilation

• Device code cached after compilation

Page 161: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Shader Compiler Process

Metal shader sources compiled to libraries at application build time • No need to ship source code with application

• Shading language errors reported at build time

Metal libraries compiled to device code at state object creation time • No draw time compilation

• Device code cached after compilation

There is also a run-time shader compiler • No draw time compilation

• For best performance, use offline compiler

Page 162: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your Application

Metal Shader Compiler Process

Metal Shader Compiler

In Xcode

Page 163: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your Application

Metal Shader Compiler Process

Metal ShaderMetal ShaderMetal ShaderMetal ShaderMetal Shader

Metal Shader Compiler

In Xcode

Page 164: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your Application

Metal Shader Compiler Process

Metal ShaderMetal ShaderMetal ShaderMetal ShaderMetal Shader

Metal Shader Compiler

In Xcode

Metal Library

Page 165: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your Application

Metal Shader Compiler Process

Metal ShaderMetal ShaderMetal ShaderMetal ShaderMetal Shader

Metal Shader Compiler

In Xcode

Metal Library

Page 166: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your Application

Metal Shader Compiler ProcessAt application run-time

Metal ShaderMetal ShaderMetal ShaderMetal ShaderMetal Shader

Metal Shader Compiler

In Xcode

Metal Library

Page 167: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your Application

Metal Shader Compiler ProcessAt application run-time

Pipeline Object Creation

State Vertex Shader

Fragment Shader

Metal ShaderMetal ShaderMetal ShaderMetal ShaderMetal Shader

Metal Shader Compiler

In Xcode

Metal Library

Page 168: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your Application

Metal Shader Compiler ProcessAt application run-time

Pipeline Object Creation

State Vertex Shader

Fragment Shader

Metal ShaderMetal ShaderMetal ShaderMetal ShaderMetal Shader

Metal Shader Compiler

In Xcode

Metal Library

Page 169: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your Application

Metal Shader Compiler ProcessAt application run-time

Pipeline Object Creation

State Vertex Shader

Fragment Shader

Shader Cache

Metal ShaderMetal ShaderMetal ShaderMetal ShaderMetal Shader

Metal Shader Compiler

In Xcode

Metal Library

Page 170: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your Application

Metal Shader Compiler ProcessAt application run-time

Pipeline Object Creation

State Vertex Shader

Fragment Shader

Metal Device CompilerShader Cache

Metal ShaderMetal ShaderMetal ShaderMetal ShaderMetal Shader

Metal Shader Compiler

In Xcode

Metal Library

Page 171: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Your Application

Metal Shader Compiler ProcessAt application run-time

Pipeline Object Creation

State Vertex Shader

Fragment Shader

Metal Device CompilerShader Cache

GPU

Metal ShaderMetal ShaderMetal ShaderMetal ShaderMetal Shader

Metal Shader Compiler

In Xcode

Metal Library

Page 172: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Metal Tools Fully Integrated into Xcode

Visual frame debugger

API trace and navigation

Shader edit-and-continue

Rich source code editing (including shaders)

Graphics and compute state inspection

Shader compiler

Debug mode for Metal framework

Page 173: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled
Page 174: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Frame Navigator

Page 175: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Framebuffer View

Page 176: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Resource View

Page 177: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

State Inspector

Page 178: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled
Page 179: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Performance Report

Page 180: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled
Page 181: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shader Profiler

Page 182: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shader Profiler

Page 183: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shader Profiler

Page 184: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shader Profiler

Page 185: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Shader Profiler

Page 186: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled
Page 187: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Demo“The Collectables” on Metal

Sean Tracey Crytek

Page 188: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Summary

Page 189: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Summary

Low overhead, high performance GPU programming API

Page 190: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Summary

Low overhead, high performance GPU programming API

Up to 10x more draw calls

Page 191: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Summary

Low overhead, high performance GPU programming API

Up to 10x more draw calls

Designed for A7 and iOS

Page 192: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Summary

Low overhead, high performance GPU programming API

Up to 10x more draw calls

Designed for A7 and iOS

Streamlined for modern GPU features

Page 193: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Summary

Low overhead, high performance GPU programming API

Up to 10x more draw calls

Designed for A7 and iOS

Streamlined for modern GPU features

Fine-grained control

Page 194: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Summary

Low overhead, high performance GPU programming API

Up to 10x more draw calls

Designed for A7 and iOS

Streamlined for modern GPU features

Fine-grained control

Precompiled shaders

Page 195: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Summary

Low overhead, high performance GPU programming API

Up to 10x more draw calls

Designed for A7 and iOS

Streamlined for modern GPU features

Fine-grained control

Precompiled shaders

Fantastic developer tools

Page 196: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Summary

Low overhead, high performance GPU programming API

Up to 10x more draw calls

Designed for A7 and iOS

Streamlined for modern GPU features

Fine-grained control

Precompiled shaders

Fantastic developer tools

Enables entirely new class of games

Page 197: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

More Information

Filip Iliescu Graphics and Games Technologies Evangelist [email protected]

Allan Schaffer Graphics and Games Technologies Evangelist [email protected]

!

Documentation http://developer.apple.com

Apple Developer Forums http://devforums.apple.com

Page 198: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Related Sessions

• Working with Metal—Fundamentals Pacific Heights Wednesday 10:15AM

• Working with Metal—Advanced Pacific Heights Wednesday 11:30AM

Page 199: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled

Labs

• Metal Lab Graphics and Games Lab A Wednesday 2:00PM

• Metal Lab Graphics and Games Lab B Thursday 10:15AM

Page 200: Working with Metal—Overview - Apple Inc. · Session 603 Jeremy Sandmel GPU Software Graphics and Games. Metal Dramatically reduced overhead Unified graphics and compute Precompiled