[shaderx6] 3.7 robust order-independent transparency via reverse depth peeling in directx 10

3.7 Robust Order-Independent Transparency via Reverse Depth Peeling in DirectX 10

ohyecloudy http://ohyecloudy.com

shader study http://cafe.naver.com/shader.cafe

2010.06.21

ShaderX6


Introduction

Depth Peeling

Reverse Depth Peeling

Overview

Algorithm

Emulating Second Depth Buffer

Optimal # of Layers

Optimizations

Conclusion

Order-Independent Transparency

Draw translucent geometry conveniently, without sorting.

Why is this so hard?

The z-buffer was designed to hold only one entry per fragment.

back-to-front order

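The traditional approach: sort the geometry in camera space, from the farthest geometry to the nearest. Sorting has a cost.

Nothing is free.

The sort is usually done on the CPU; it can also be done on the GPU with a method such as bitonic sort.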
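A minimal sketch of the CPU-side per-object sort described above. The `TranslucentObject` struct, its fields, and `SortBackToFront` are hypothetical names chosen for illustration; the only thing taken from the slides is the ordering rule, farthest first.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical per-object record: one depth value stands in for the whole object.
struct TranslucentObject {
    std::string name;      // label used only for illustration
    float viewSpaceDepth;  // distance from the camera in camera space
};

// Orders objects from farthest to nearest so they can be blended back to front.
void SortBackToFront(std::vector<TranslucentObject>& objects)
{
    std::stable_sort(objects.begin(), objects.end(),
        [](const TranslucentObject& a, const TranslucentObject& b) {
            return a.viewSpaceDepth > b.viewSpaceDepth;
        });
}
```

Because each object is reduced to a single depth value, two objects that interpenetrate still end up in one fixed order for all of their pixels.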

Problems with back-to-front ordering

The sort is per-object or per-polygon.

It is not per-pixel, so visual artifacts remain.

Because rendering follows the sorted order,

draw calls cannot be batched.

There is a lot of shader switching.

Figure: comparison of per-object sorting and reverse depth peeling.

Depth Peeling

Peel the depth away one layer at a time, the way you peel a tangerine,

and draw each peel into a layer.


Figure: the translucent polygons are rendered without sorting into a set of layers, one layer extracted per layer in use; the layers are then blended into the render target in back-to-front order.

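Transparency can be evaluated per pixel.

No visual artifacts.

The layers resemble the G-buffer

used in deferred shading.

The layers are stored in video memory.

Just how many of them are there?!

That is a heavy burden.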
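To make the memory pressure concrete, here is a rough estimate as a small helper. It assumes each layer is a single color buffer at a fixed number of bytes per pixel and ignores depth buffers, alignment, and multisampling. `LayerMemoryBytes` and its parameters are hypothetical.

```cpp
#include <cstdint>

// Estimates the video memory taken by nLayers color layers.
// Assumption: every layer has the same size and pixel format.
std::uint64_t LayerMemoryBytes(std::uint32_t width, std::uint32_t height,
                               std::uint32_t bytesPerPixel, std::uint32_t nLayers)
{
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel * nLayers;
}
```

Under these assumptions, four layers at 1280x720 with 4 bytes per pixel come to 14,745,600 bytes, and every additional layer adds another 3,686,400 bytes.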

Reverse Depth Peeling: Overview

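Let's reduce the memory use of depth peeling.

Use only one layer buffer: the same buffer is updated and reused for every layer.

This does not mean there is conceptually only one layer.

The order in which layers are extracted is changed.

Depth peeling: extracts layers in front-to-back order.

Reverse depth peeling: extracts layers in back-to-front order.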

Figure: the translucent polygons are rendered without sorting into a single layer, and that layer is blended into the render target. The extraction is repeated once per layer in use: layer = 1 (furthermost), layer = 2, layer = 3, layer = 4 (frontmost), each one blended into the render target in turn.

Reverse Depth Peeling: Algorithm

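// Assumption: pDepthBuffer[1] holds 1.0 everywhere before the loop, so the first LESS test passes.
for (int nLayer = 0; nLayer < nRequiredLayers; ++nLayer)
{
    // First depth buffer: keeps the farthest fragment (writes on, GREATER test)
    BindDepthBuffer(0, pDepthBuffer[0], EnableWrites, GREATER);
    Clear(pDepthBuffer[0], 0.0);

    // Second depth buffer: rejects fragments peeled in earlier passes (read-only, LESS test)
    BindDepthBuffer(1, pDepthBuffer[1], DisableWrites, LESS);

    // Render the current layer with blending off
    SetRenderTarget(pCurrentTransparentLayer);
    SetBlendMode(ONE, ZERO);
    DrawTransparentGeometry();

    // Blend the layer into the main render target
    SetTexture(pCurrentTransparentLayer);
    SetRenderTarget(pMainRenderTarget);
    SetBlendMode(SRCALPHA, INVSRCALPHA);
    DrawFullscreenQuad();

    // This pass's farthest depth becomes the next pass's LESS limit
    SWAP(pDepthBuffer[0], pDepthBuffer[1]);
}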


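BindDepthBuffer(0, pDepthBuffer[0], EnableWrites, GREATER) and Clear(pDepthBuffer[0], 0.0): this depth buffer decides which translucent fragment is the farthest, so Z values are written to it. The comparison is GREATER, so the whole buffer is cleared to 0.0.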


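BindDepthBuffer(1, pDepthBuffer[1], DisableWrites, LESS): this depth buffer keeps geometry that was peeled in an earlier layer from being peeled again. Keep in mind that the peeling runs back to front, hence the LESS comparison. Its only purpose is to reject fragments, so no Z values are written to it.

SWAP(pDepthBuffer[0], pDepthBuffer[1]): the farthest depth recorded in this pass is used as the LESS comparison value in the next loop iteration.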


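SetRenderTarget(pCurrentTransparentLayer), SetBlendMode(ONE, ZERO), DrawTransparentGeometry(): render the translucent geometry into the layer. This is a temporary render whose result is then blended into the render target.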


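SetTexture(pCurrentTransparentLayer), SetRenderTarget(pMainRenderTarget), SetBlendMode(SRCALPHA, INVSRCALPHA), DrawFullscreenQuad(): blend the layer into the main render target.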
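The loop can be followed on the CPU for a single pixel. The sketch below is only an illustration of the peeling order, not the GPU implementation. `Fragment` and `ReverseDepthPeelPixel` are hypothetical names, color is reduced to one channel, and depths are assumed to lie strictly between 0 and 1.

```cpp
#include <vector>

// One translucent fragment covering the pixel (hypothetical layout).
struct Fragment {
    float depth;  // 0 = near plane, 1 = far plane
    float color;  // single channel, to keep the sketch short
    float alpha;
};

// Peels the pixel's fragments back to front and blends each peel over the result.
float ReverseDepthPeelPixel(const std::vector<Fragment>& fragments,
                            float background, int nRequiredLayers)
{
    float result = background;
    float depthLimit = 1.0f;  // stands in for pDepthBuffer[1] (LESS test)
    for (int nLayer = 0; nLayer < nRequiredLayers; ++nLayer) {
        float farthest = 0.0f;  // stands in for pDepthBuffer[0] cleared to 0.0
        const Fragment* layer = nullptr;
        for (const Fragment& f : fragments) {
            // LESS against the previous peel, GREATER against this pass's best
            if (f.depth < depthLimit && f.depth > farthest) {
                farthest = f.depth;
                layer = &f;
            }
        }
        if (layer == nullptr) break;  // nothing left to peel
        // SRCALPHA / INVSRCALPHA blend into the main render target
        result = layer->alpha * layer->color + (1.0f - layer->alpha) * result;
        depthLimit = farthest;  // the SWAP step
    }
    return result;
}
```

The fragments can be submitted in any order. Each pass picks the farthest fragment not yet peeled, so the blend always runs back to front.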

Emulating Second Depth Buffer



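Wait! Binding depth buffers at indices 0 and 1? Does such a thing exist?

It would be nice, but no, it does not.

It is only pseudo-code.

The second test

Compare the depth values and discard the fragment.

Nothing is written, so it is simple to implement.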

struct PS_INPUT
{
    float4 vPosition : SV_POSITION;
    float2 vTex : TEXCOORD0;
};

Texture2D txInputDepth;

float4 PSRenderObjects(PS_INPUT input) : SV_TARGET
{
    // Fetch depth value from 2nd depth buffer
    float fDepth = txInputDepth.Load(int3(input.vPosition.xy, 0)).x;

    // Discard fragment if LESS depth test fails
    float f = (fDepth <= input.vPosition.z);
    clip(-f);

    // calculate color and alpha etc
    ...
}
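The `clip(-f)` line is easy to misread, so here is a small CPU-side emulation of the same test. `ClipDiscards` and `IsFragmentDiscarded` are hypothetical helpers; the only behavior relied on is that HLSL `clip(x)` discards the fragment when `x` is less than zero.

```cpp
// Mirrors HLSL clip(x): the fragment is discarded when x is less than zero.
bool ClipDiscards(float x)
{
    return x < 0.0f;
}

// Emulated LESS test against the depth stored in the second depth buffer.
bool IsFragmentDiscarded(float storedDepth, float fragmentDepth)
{
    // f is 1 when the LESS test fails, 0 when it passes
    float f = (storedDepth <= fragmentDepth) ? 1.0f : 0.0f;
    // -1 is below zero and discards; -0 is not below zero and keeps the fragment
    return ClipDiscards(-f);
}
```

A fragment at the same depth as the stored value is discarded too, which is what keeps the previously peeled layer from being drawn again.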

Optimal # of Layers

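How many layers should we use?

The simple answer: choose the number of layers from the depth complexity.

DirectX9::GetDepthComplexity(): a function that returns the depth complexity of the current scene

Wouldn't it be nice if something like this existed?

Of course it does not.

Use a fixed number of layers and accept some visual error.

Is there a better way?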

Occlusion Queries

They tell us whether pixels

passed the depth test or not.

ID3D10Query::Begin() ~ ID3D10Query::End()

ID3D10Query::GetData()

• Returns the number of pixels that passed the depth test.

So the number of layers can be adjusted dynamically.

Occlusion Queries

Peel until no pixels remain.

Correct in principle.

For better performance, set a threshold and stop once it is reached.

The demo uses a threshold of 0.01%.
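The stopping rule can be sketched independently of the Direct3D calls. In the sketch below the per-layer pixel counts are passed in as a plain vector standing in for the occlusion query results, and the threshold is assumed to be a fraction of the render target's pixel count. The slides give only the 0.01% figure, so that interpretation, along with the name `CountLayersToPeel`, is an assumption.

```cpp
#include <cstdint>
#include <vector>

// Counts how many layers are peeled before the pass count drops to the threshold.
// Assumption: thresholdFraction is relative to totalPixels (0.0001 for 0.01%).
int CountLayersToPeel(const std::vector<std::uint64_t>& pixelsPassedPerLayer,
                      std::uint64_t totalPixels, double thresholdFraction)
{
    const double limit = thresholdFraction * static_cast<double>(totalPixels);
    int nLayers = 0;
    for (std::uint64_t passed : pixelsPassedPerLayer) {
        if (static_cast<double>(passed) <= limit) break;  // too few pixels left to matter
        ++nLayers;
    }
    return nLayers;
}
```

With a threshold of zero this reduces to the "peel until no pixels remain" rule. A small positive threshold skips the last, nearly empty layers.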

Optimizations

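Transform

The translucent geometry is transformed again for every layer.

Stream-out

Direct3D 10

Stores the transformed geometry in a buffer

so that it can be reused.

In the end, the transform is done once and its result is used many times.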

Fill-Rate

Use a dynamic branch so that fragments failing the emulated depth test skip the color and alpha calculation.

float fDepth = txInputDepth.Load(int3(input.vPosition.xy, 0)).x;
if (input.vPosition.z < fDepth)
{
    // Emulated depth test passes
    // calculate color and alpha etc..
}
else
{
    // Emulated depth test fails. kill fragment
    discard;
}

Conclusion

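Covered the concepts of depth peeling and reverse depth peeling.

It seems practical only with DirectX 10 or later.

It probably needs hands-on use before it can be properly evaluated.

For now, the most practical solution is still

sorting in the traditional way,

or alpha to coverage if some quality can be given up.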
