
Post on 13-Dec-2015


Practical use of Screen Space Ambient Occlusion

Michał Drobot
Visual Technical Director
Reality Pump

Talk Outline

• Motivation
• Theory
• General Idea
• Practical Solution in Details
  • Screen Space to Camera Space
  • Raycasting
  • Blocker Test
  • Filtering
  • Compositing with HDRi

Talk Outline

• Tackling Problems
  • Deferred Renderers
  • Forward Renderers
• Performance & IQ Optimizations
• Further Development
• Summary
• Questions?

Motivation

• Multiplatform
• HQ seamless land/city-scape rendering
• Realistic lighting model:
  • Dynamic day cycle
  • Dynamic lights
  • Dynamic environments
  • Global Illumination approximation taking all of the above into consideration = PROBLEMATIC!

Motivation

• Solutions:
  • PRT maps
  • Per-vertex SH
  • Point clouds
  • In-house offline solutions…
• All failed:
  • Memory consumption (X360)
  • Performance
  • Flexibility

Motivation

• Needed a new solution
  • Dynamic
  • Scalable
  • Max performance/memory ratio
• Screen space solution
  • Image depth enhancement publications
  • Hybrid rendering engine
  • Z-based post-processing

Theory

• Ambient Occlusion – technically
  • Global Illumination approximation
  • Omnidirectional lighting
  • Accessibility shading
  • Film-industry-proven prepass skylight

Theory

• AO – visual appeal
  • Artists bake skylight into textures
  • Environmental effect – surface reaction to:
    • Dirt/weathering
    • Light
  • Prepass for Global Illumination
  • Enhances the scene with depth, curvature and spatial proximity cues

Theory

• Integral of the blocker function over the hemisphere
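
For reference, one common textbook way to write this integral is given below. It is not taken from the slides, and the symbols are assumptions: p is the shaded point, n its unit normal, Ω the hemisphere around n, and V(p, ω) the blocker function (1 if a ray leaving p in direction ω hits geometry, 0 otherwise).

$$A(p) = \frac{1}{\pi}\int_{\Omega} V(p,\omega)\,(n\cdot\omega)\,d\omega$$

With this convention A(p) = 0 means the point is fully open and A(p) = 1 means it is fully occluded, since the cosine term alone integrates to π over the hemisphere.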

Theory

• Point A is not occluded
• Point B is darkened (high geometric proximity)

General Idea

• Screen Space = simplified calculation domain (performance)
• Depth buffer
  • Resembles the scene in SS
  • Generally available
  • Preferably linear, high precision (>FP16)
• Pixel (x,y) in SS & DB value = (x,y,z) in CS
• Camera Space = simplified calculation domain (complexity)

Theory

• For each pixel in SS
  • For 1 to N samples
    • Transform to CS domain
    • Raycast in CS in a random direction on the hemisphere around the point’s normal
    • Gather irradiance information from the surface hit check
    • Accumulate results, considering atmospheric attenuation and radiance energy
  • Return resulting irradiance
• Composite the final SS result layer with HDRi

Practical Solution in Details

Screen Space to Camera Space

• SS pixel (x,y) & depth value = (x,y,z) in CS
• Min 1 mul = fast
• Keep your depth high precision to avoid artifacts
• Preferably FP32 linear depth (depends on your scene depth range)
• Our solution is based on FP16 linear depth transformed from the original log depth buffer (gives filterable artifacts)
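
A minimal sketch of this mapping, assuming a symmetric perspective camera looking down +Z and a depth buffer that already stores linear camera-space depth. The function names, the vertical field of view parameter and the pixel-centre convention are illustrative assumptions, not details from the talk.

```python
import math

def screen_to_camera(px, py, linear_depth, width, height, fov_y_deg):
    """Reconstruct a camera-space point from a pixel and its linear depth."""
    tan_half = math.tan(math.radians(fov_y_deg) / 2.0)
    aspect = width / height
    # Pixel centre -> normalized device coordinates in [-1, 1], y pointing up.
    ndc_x = (px + 0.5) / width * 2.0 - 1.0
    ndc_y = 1.0 - (py + 0.5) / height * 2.0
    # View ray at unit depth; scaling it by the depth is the cheap part.
    ray = (ndc_x * tan_half * aspect, ndc_y * tan_half, 1.0)
    return tuple(c * linear_depth for c in ray)

def camera_to_screen(point, width, height, fov_y_deg):
    """Project a camera-space point back to continuous pixel coordinates."""
    x, y, z = point
    tan_half = math.tan(math.radians(fov_y_deg) / 2.0)
    aspect = width / height
    ndc_x = x / (z * tan_half * aspect)
    ndc_y = y / (z * tan_half)
    return ((ndc_x + 1.0) * 0.5 * width, (1.0 - ndc_y) * 0.5 * height)

p = screen_to_camera(100, 50, 10.0, 640, 480, 60.0)
sx, sy = camera_to_screen(p, 640, 480, 60.0)
```

In a shader the per-pixel view ray would typically come from an interpolator, which leaves the single multiply by depth mentioned above.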

Raycasting

• Raycast by raymarching
• Generate a normalized hemisphere of randomly distributed points with a uniform distribution, e.g. Poisson
• Rotate the hemisphere so its horizon is perpendicular to the point’s normal (for now we assume we have it)
• Scale the hemisphere to the desired effect range
• Offset the original CS point by the generated points

Raycast

• Transform the new point from CS to SS
• Acquire the point’s z value from our scene in SS
• We can also acquire additional info such as diffuse color, radiance energy, etc.
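
The sample generation steps can be sketched as follows. This is a hedged illustration: plain rejection sampling stands in for the Poisson distribution mentioned in the slides, and the helper names (`local_hemisphere`, `sample_points`) are made up for this example.

```python
import math
import random

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)

def local_hemisphere(n_samples, rng):
    """Random points inside the unit hemisphere around +Z (rejection sampling)."""
    points = []
    while len(points) < n_samples:
        v = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(0, 1))
        if dot(v, v) <= 1.0:
            points.append(v)
    return points

def sample_points(origin, normal, radius, kernel):
    """Rotate the kernel so +Z maps to the normal, scale it, offset the origin."""
    n = normalize(normal)
    # Helper axis that is guaranteed not to be parallel to n.
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    t = normalize(cross(helper, n))
    b = cross(n, t)
    result = []
    for x, y, z in kernel:
        offset = tuple(radius * (x * t[i] + y * b[i] + z * n[i]) for i in range(3))
        result.append(tuple(origin[i] + offset[i] for i in range(3)))
    return result

kernel = local_hemisphere(16, random.Random(1))
points = sample_points((1.0, 2.0, 5.0), (1.0, 2.0, 2.0), 0.5, kernel)
```

Each resulting point would then be projected back to screen space so that the depth buffer can be read at its (x,y).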

Blocker Test

• In the same domain we’ve got
  • Depth of ‘possible’ hit (pd)
  • Real depth at (x,y) of ‘possible’ hit point (rd)
• We want to check whether our ray hit something
• If something exists in our depth buffer at the SS coordinates of ‘pd’ and at the same depth, we’ve got a HIT

Blocker Test

• We need a comparison function that accounts for attenuation, limited samples and possible precision errors
  • Real Distance Proximity Evaluation
  • Depth Difference Evaluation

Blocker Test (RDPE)

• Real Distance Proximity Evaluation (RDPE)
  • Abs(pd-rd) < E
    • If TRUE = HIT
      • Accumulate results: Occlusion += 1/(1+(length(ray)*scale)^2)
    • If FALSE = miss
      • Discard results: SamplesNo--;
      • Alternatively, for high accuracy, repeat the whole raycasting process using linear/binary search
  • finalOcclusion = Occlusion/SamplesNo
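
A minimal sketch of the RDPE accumulation above, without the optional linear/binary search refinement. The input layout (a list of `(pd, rd, ray_length)` tuples) and the zero result when every sample misses are assumptions made for this example.

```python
def rdpe_occlusion(samples, epsilon, scale):
    """samples: list of (pd, rd, ray_length) tuples for one pixel."""
    occlusion = 0.0
    samples_no = len(samples)
    for pd, rd, ray_length in samples:
        if abs(pd - rd) < epsilon:
            # HIT: attenuate by the squared, scaled ray length.
            occlusion += 1.0 / (1.0 + (ray_length * scale) ** 2)
        else:
            # Miss: the sample is discarded and does not count.
            samples_no -= 1
    # Assumption: a pixel whose samples all missed is treated as unoccluded.
    return occlusion / samples_no if samples_no > 0 else 0.0

value = rdpe_occlusion([(1.0, 1.05, 1.0), (1.0, 5.0, 1.0)], epsilon=0.1, scale=1.0)
```

Because misses shrink `samples_no`, a pixel with few hits is averaged over very few samples, which is the undersampling problem discussed next.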

Blocker Test (RDPE)

• Pros
  • Conceptually precise results
  • Superb quality with linear/binary raymarching (matching offline renderers)
  • Ideal for GI
  • Global (full screen) solution
• Cons
  • Discarding MISSes leads to undersampling, requiring more samples with good distribution
  • Requires ray length calculations
  • Insane sampling rate with the raymarching solution: SampleNo * PerSampleMissSampling

Blocker Test (RDPE)

• Scalability
  • E – Epsilon / sample number ratio
  • Iteration/E limit for raymarching
  • Still, the raymarched solution needs at least 16 subsamples on average (using flow control)
• Practically useless due to undersampling artifacts when using a sane number of samples

Blocker Test (DDE)

• Depth Difference Evaluation (DDE)
  • Use all depth difference data with an attenuation function
  • Occlusion = 1.0/(1.0+(pd – rd)^2)
  • FinalOcclusion = Occlusion/SamplesNo
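
The DDE formulas can be sketched as below, assuming the per-sample term is summed over all samples before the division (the slide shows only the per-sample expression). The function name and the `(pd, rd)` pair layout are illustrative.

```python
def dde_occlusion(pairs):
    """pairs: list of (pd, rd) depth pairs for one pixel; every sample counts."""
    if not pairs:
        return 0.0
    total = 0.0
    for pd, rd in pairs:
        # Attenuation falls off with the squared depth difference.
        total += 1.0 / (1.0 + (pd - rd) ** 2)
    return total / len(pairs)

value = dde_occlusion([(1.0, 1.0), (1.0, 2.0)])  # (1.0 + 0.5) / 2 = 0.75
```

Unlike RDPE there is no branch and no discarded sample, which is where the speed and simplicity come from.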

Blocker Test (DDE)

• Pros
  • Fast
  • Simple
  • Perceptually accurate
• Cons
  • Lacks accuracy
  • Gives more local results
  • Generally too local for GI

Blocker Test (DDE)

• Scalability
  • Sample rate tweaking
  • Scale samples with Z-depth (flow control)
• Practically, DDE is a WIN
  • But… only for this generation…

Filtering

• Because of undersampling artifacts, the acquired occlusion layer should be filtered
• Due to the low-frequency nature of AO, denoising blur filters are the weapon of choice
• The filter should smooth out results, denoise the signal and retain the details

Filtering

• Gaussian filter
  • Weighted average of sampled values
  • Requires precomputed kernel weights
  • Kernel of NxN pixels

Filtering

• Gaussian filter
  • 2-pass filtering with interpolator usage
  • Very fast: O(N)
  • Good smoothing and denoising ability
  • Removes details, blurs edges
  • AO bleeding occurs
  • Introduces new artifacts (halos, bleeding)
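
A small CPU-side sketch of the two-pass (separable) Gaussian, operating on a list of rows. The clamped border handling and the function names are assumptions for this example; a GPU version would run one pass per direction in a pixel shader.

```python
import math

def gaussian_kernel(radius, sigma):
    """Precomputed, normalized 1D weights for taps -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_1d(row, kernel):
    """Weighted average along one row, clamping reads at the borders."""
    radius = len(kernel) // 2
    last = len(row) - 1
    out = []
    for i in range(len(row)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += w * row[j]
        out.append(acc)
    return out

def gaussian_blur(image, radius, sigma):
    """Horizontal pass followed by a vertical pass: O(N) taps per pixel."""
    kernel = gaussian_kernel(radius, sigma)
    horizontal = [blur_1d(row, kernel) for row in image]
    columns = [blur_1d(list(col), kernel) for col in zip(*horizontal)]
    return [list(row) for row in zip(*columns)]

edge = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(4)]
blurred = gaussian_blur(edge, 2, 1.0)
```

In `blurred` the hard step between columns 2 and 3 is smeared across several pixels, which is the edge blurring and bleeding listed above.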

Filtering

Figure: Gaussian 18x18, separable, 18 taps

Filtering

• Median filter
  • Median of sampled values
  • Kernel of NxN pixels
  • Heavy use of vector arithmetic for value sorting

Filtering

• Median filter
  • 1-pass artifact-free filter is O(N^2)
    • That’s too slow for a reasonable kernel size
    • Possible use of combined multipass filters of small size
    • IQ better than Gaussian, but too slow
  • 2-pass (V/H) is only O(2*N)
    • Fast enough
    • Not correct
    • May introduce block artifacts
    • Usable as a multipass/varied kernel size filter

Filtering

Figure: Median 18x18, separable, 18 taps

Filtering

Figure: Median 2x(6x6), non-separable, 18 taps

Filtering

• Bilateral filtering
  • 3D filter
  • Filtering in the domains of spatial proximity and intensity
  • Possible fast implementation using vector arithmetic

$$BF[I]_p = \frac{1}{W_p}\sum_{q \in S} G_d(\lVert p-q \rVert)\, G_i(\lvert I_p - I_q \rvert)\, I_q$$

Filtering

• We can simplify BF by using the same Gaussian for the proximity and intensity domains
• For each pixel p
  • Normalize the sum of
    • For each pixel q ∈ S
      • Gaussian(||p-q||) * Gaussian(|Ip-Iq|) * Iq

Filtering

• Bilateral filter is NOT separable
  • But results of separable filtering are acceptable
• We can use linear filtering for bigger kernels
• Optimized BF is O(n) and as fast as Gaussian when done right
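
A hedged sketch of the separable approximation: a 1D bilateral pass applied to rows and then to columns. Separate sigmas for distance and intensity are kept as parameters here (an assumption; passing the same value twice gives the simplified variant), and all names are illustrative.

```python
import math

def gauss(x, sigma):
    return math.exp(-(x * x) / (2.0 * sigma * sigma))

def bilateral_1d(row, radius, sigma_d, sigma_i):
    """Each tap is weighted by spatial distance and by intensity difference."""
    out = []
    n = len(row)
    for p in range(n):
        acc = 0.0
        weight_sum = 0.0
        for q in range(max(0, p - radius), min(n, p + radius + 1)):
            w = gauss(p - q, sigma_d) * gauss(row[p] - row[q], sigma_i)
            acc += w * row[q]
            weight_sum += w
        # weight_sum is at least 1.0 because the centre tap has weight 1.
        out.append(acc / weight_sum)
    return out

def bilateral_separable(image, radius, sigma_d, sigma_i):
    """Rows first, then columns: an approximation, not the exact 2D filter."""
    rows = [bilateral_1d(r, radius, sigma_d, sigma_i) for r in image]
    cols = [bilateral_1d(list(c), radius, sigma_d, sigma_i) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

edge = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(4)]
filtered = bilateral_separable(edge, 2, 1.0, 0.1)
```

With a small intensity sigma the step edge in `edge` survives almost untouched, because taps from the other side of the edge receive a near-zero weight.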

Filtering

Figure: Bilateral 18x18, separable, 18 taps

Filtering

Figure: Bilateral 9x9, separable, 18 taps

Filtering

• Bilateral filter
  • Gives very good smoothing
  • Retains details
  • Doesn’t affect edges
  • Prevents bleeding
  • Is fast
  • Is our choice

Filtering

Figure: side-by-side comparison of Original, Gaussian 18x18, Bilateral 18x18, Bilateral 9x9, Median 9x9 and Median 2x(6x6)

Compositing with HDRi

• Forward rendering
  • Compose as a post-process by multiplying with HDRi before tone mapping
    • Exposure will need adjustment
  • Post HDR – AP soft light layer processing (or other chosen by artists)
• Deferred rendering
  • Do it first
  • Use it as the ambient occlusion term during the material shading and lighting pass
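
The forward-rendering path can be illustrated with a tiny sketch. Two assumptions are made here that the slides do not specify: `ao` is a visibility factor in [0, 1] where 1 means unoccluded, and a simple Reinhard-style curve stands in for whatever tone mapping operator the engine uses.

```python
def composite(hdr_rgb, ao, exposure=1.0):
    """Multiply the HDR colour by the AO factor, then tone map."""
    # AO is applied in HDR space, before tone mapping.
    lit = tuple(c * ao * exposure for c in hdr_rgb)
    # Assumed tone mapping operator: x / (1 + x).
    return tuple(c / (1.0 + c) for c in lit)

open_pixel = composite((1.0, 1.0, 1.0), 1.0)
occluded_pixel = composite((1.0, 1.0, 1.0), 0.5)
```

Since the multiply darkens the image overall, the `exposure` parameter is where the adjustment mentioned above would go.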

Tackling Problems

• Deferred rendering
  • Easy access to WS normal layer
  • Normal steps as stated before
  • Gives highest quality and stability
• Optimization of hemisphere generation
  • Generate a sphere of points
  • Mirror all of them along the plane perpendicular to the pixel’s normal (transformed to CS)
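
A minimal sketch of the mirroring step, assuming the normal is already unit length and in camera space. The function name is illustrative; only points that fall below the surface plane are reflected.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def mirror_into_hemisphere(points, normal):
    """Reflect points lying behind the plane through the origin with this normal."""
    result = []
    for p in points:
        d = dot(p, normal)
        if d < 0.0:
            # Mirror across the plane: p - 2 (p . n) n
            p = tuple(p[i] - 2.0 * d * normal[i] for i in range(3))
        result.append(p)
    return result

sphere = [(0.0, 0.0, -1.0), (1.0, 0.0, 0.0), (0.6, 0.0, -0.8)]
hemisphere = mirror_into_hemisphere(sphere, (0.0, 0.0, 1.0))
```

This keeps one precomputed sphere of points for every pixel and avoids building a rotation per normal.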

Tackling Problems

• Forward rendering
  • We have NO access to a normal layer
  • We CANNOT generate a hemisphere of points
  • Our only solution is to generate a sphere

Tackling Problems

• FR sphere sampling problems
  • RDPE
    • Raytracing ‘inside’ the surface
    • Performance loss / undersampling in RDPE
  • DDE
    • Can’t find backface rays
    • Backface rays return false results
    • Lack of stability due to non-uniform front/backface sampling rate
• Are we doomed?

Tackling Problems

• FR DDE continued…
  • Naive solution – acceptable results
  • Discard all results for rays ‘behind’ the point we are shading
  • Pros
    • Gives acceptable results
  • Cons
    • Undersampling due to 50% loss of data on average
    • Not accurate because we’re discarding ‘good’ points
    • Lacks stability = introduces view dependency because of the non-uniform number of samples at higher angles

Tackling Problems

• FR DDE – Real Solution
  • Observation
    • In CS when (pd-rd)<0 (checked depth was ‘behind’ the ray) there’s no occluder affecting the current pixel
    • We can use it to change our blocker function
  • New blocker function
    • if (pd-rd)<0
      • TRUE – Discard results – SamplesNo--;
      • FALSE – HIT – Occlusion += 1.0/(1.0+abs(pd – rd)^2)
    • finalOcclusion = Occlusion/SamplesNo
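
The new blocker function can be sketched as follows. The `(pd, rd)` pair layout, the function name and the zero result when every sample is discarded are assumptions made for illustration.

```python
def fr_dde_occlusion(pairs):
    """pairs: list of (pd, rd) depth pairs gathered from a full sphere of rays."""
    occlusion = 0.0
    samples_no = len(pairs)
    for pd, rd in pairs:
        diff = pd - rd
        if diff < 0.0:
            # The stored depth lies behind the ray point: no occluder, discard.
            samples_no -= 1
        else:
            # HIT: attenuate by the squared depth difference.
            occlusion += 1.0 / (1.0 + abs(diff) ** 2)
    # Assumption: if every sample was discarded the pixel is unoccluded.
    return occlusion / samples_no if samples_no > 0 else 0.0

value = fr_dde_occlusion([(2.0, 1.0), (1.0, 2.0)])  # only the first pair counts: 0.5
```

The sign test replaces the missing normal: it decides per sample whether the ray point sits in front of or behind the visible surface.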

Tackling Problems

• FR DDE – Real Solution
  • New blocker function
    • Gives very good accuracy
    • Undersampling occurs but is greatly minimized on average
    • Still high view dependency due to non-uniformity of samples
  • We want to guarantee that each pixel at each angle will be shaded with the same number of samples

Tackling Problems

• FR DDE – Real Solution
  • GEOSPHERE to the rescue
  • Instead of a randomly generated sphere of ray points we use randomly rotated points lying on a GEOSPHERE
  • A geosphere guarantees that a plane slicing it will always divide its point count in half
  • Guarantee of uniform samples solves view dependency problems
  • For better accuracy we can generate geosphere-in-geosphere point clouds
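
The half-split property can be checked with a small sketch. It assumes an icosahedron as the point set (12 vertices that come in antipodal pairs); whether this matches the exact geosphere used in the talk is not stated in the slides, and the function names are illustrative.

```python
import math
import random

def icosahedron_vertices():
    """12 unit-length vertices; for every vertex v, -v is also a vertex."""
    phi = (1.0 + math.sqrt(5.0)) / 2.0
    raw = []
    for a in (-1.0, 1.0):
        for b in (-phi, phi):
            raw += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]
    scale = math.sqrt(1.0 + phi * phi)
    return [tuple(c / scale for c in v) for v in raw]

def split_counts(points, plane_normal):
    """Count points strictly in front of and strictly behind a plane through the origin."""
    front = back = 0
    for p in points:
        d = sum(p[i] * plane_normal[i] for i in range(3))
        if d > 0.0:
            front += 1
        elif d < 0.0:
            back += 1
    return front, back

rng = random.Random(7)
points = icosahedron_vertices()
normal = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
front, back = split_counts(points, normal)
```

Because the points come in opposite pairs, `front` always equals `back` for any plane orientation, so every pixel keeps the same number of usable samples regardless of the surface angle.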

Tackling Problems

• FR DDE – Real Solution
  • Summary
    • Use enhanced blocker function
    • Use Geosphere for ray generation

Performance & IQ Optimizations

• Use low-res buffers for SSAO rendering
  • Efficient raycasting
  • Efficient filtering
  • Bilateral filtering / bilinear upsampling combo

Performance & IQ Optimizations

• Use jittering
  • Use dependent depth reads instead of a high sampling rate
  • Precompute your point sphere
    • Geocube
    • Poisson distribution
  • And rotate it by a random vector at every pixel
    • Reflect(SpherePos[i],RandomVector)
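
A sketch of the per-pixel jitter, assuming `Reflect` behaves like the usual shader intrinsic (v - 2 (v . n) n with a unit-length n). The helper names and the way the random vector is supplied are assumptions for this example.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)

def reflect(v, n):
    """Mirror v about the plane whose unit normal is n."""
    d = dot(v, n)
    return tuple(v[i] - 2.0 * d * n[i] for i in range(3))

def jitter_kernel(sphere_pos, random_vector):
    """Re-orient a precomputed point sphere with one random vector per pixel."""
    n = normalize(random_vector)
    return [reflect(p, n) for p in sphere_pos]

sphere_pos = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
jittered = jitter_kernel(sphere_pos, (0.3, -0.5, 0.8))
```

A reflection preserves lengths and the relative layout of the points, so the precomputed distribution keeps its properties while its orientation changes from pixel to pixel.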

Performance & IQ Optimizations

• Acquire random vectors
  • Use pregenerated noise textures
  • Map them at a 1:1 pixel ratio in screen space
• Scale the sample number to your needs
  • Scale search radius and sample number according to depth
• Use as fast an image format as you can for depth sampling, depending on hardware
  • Pure SSAO (FP16, FP32)
  • Irradiance transfer (RGBA16)

Further Development

• Reuse results from the last N frame buffers
  • Efficiency/accuracy due to the low-frequency nature of AO
• Compute irradiance transfer
  • Color bleeding with DDE
    • Gather diffuse color samples transformed with IR energy transfer matrix
  • GI approximation with RDPE
    • Can use NdotL diffuse layer for IR energy transfer approximation in the whole scene

Summary

• Main steps
  • Calculate SSAO
    • FR / Deferred solution
    • RDPE / DDE
  • Filter results
  • Compose

Summary

• Our solution uses
  • FR DDE
  • Special image enhancement blocker function for edge highlighting
  • Geosphere (18v on PC / 12v on X360) = 18/12 samples
  • Jittering
  • Flow control
  • ¼ RT size for SSAO generation and filtering
  • Separable bilateral filter
  • RGBA16 – RGB irradiance, A depth

Summary

• SSAO done right
  • Enhances your scene
  • Fakes lightmaps where they aren’t
  • Can be fast
  • Can fake GI color bleeding
  • Is only local, thus view-dependent
    • But the illusion is good enough
  • Is a good start for future, more complex scene lighting solutions

Summary

• With SSAO as GI approximation
  • You don’t need additional textures for assets
    • Pure diffuse (no baked AO)
    • Normal/height map
    • Specular
    • Additional special textures
  • SSAO can substitute lightmaps in many cases
  • Leave all lighting to your engine

Summary

For more information contact me

hello@drobot.org

Slides, whitepaper and code will be available at

Drobot.org

Everybody is welcome at

Booth 20 Hall 5 GC

Summary

Special thanks go to

Mirosław Dymek

Michał Orkisz

Mariusz Szaflik

Demo

Demo time :)

Fully RT SSAO

‘AS IS’ in our game environment

WiP

Questions

?
