Progressive Perceptual Audio Rendering of Complex Scenes

Thomas Moeck (1,2), Nicolas Bonneel (1), Nicolas Tsingos (1), George Drettakis (1), Isabelle Viaud-Delmon (3), David Alloza (4)

1- REVES/INRIA Sophia-Antipolis
2- Computer Graphics Group, University of Erlangen-Nuremberg
3- CNRS-UPMC UMR 7593
4- EdenGames

Objectives

Efficient audio rendering of very complex scenes with moving sources

Without audible impairment of the quality

Verify results by user tests

Previous Work

Rendering complex auditory scenes
  Clustering [Tsingos et al. 2004]: replace many sources with a representative
  Still can only treat ~200 sound sources (cost of clustering itself)

Scalable audio processing
  Importance-guided processing of few frequency/time bins [Fouad et al. 1997, Wand & Straßer 2004, Gallo et al. 2005, Tsingos 2005]
  Audio processing (e.g., HRTF, spatialization) is expensive

Crossmodal effects
  Neuroscience literature: “Ventriloquism affects 3D audio perception”
  Ventriloquism spatial window can vary from a few up to 15 degrees
  Few papers on ecological experiments

Methodology

Recursive approach to clustering
  Reduce cost of clustering

Scalable perceptual premixing
  Faster premixing without audible loss of quality

Taking perceptual and cross-modal information into account
  Improve audio clustering algorithm

User experiments to detect improvement possibilities
  Improving quality with results of tests

Validation of resulting algorithms

Overview of the algorithms

Masking of inaudible sources (with energy)
Clustering of remaining sources (recursive)
Progressive premixing within each cluster
Spatial audio processing (HRTF)
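
The first stage, masking of inaudible sources by energy, could look roughly like the sketch below. It is a simplified illustration, not the authors' masking model: the function name `cull_masked_sources`, the decision rule and the 20 dB margin are all assumptions made for this example.

```python
import math

def cull_masked_sources(energies, margin_db=20.0):
    """Return indices of sources kept after a simple energy-based masking test.

    Sources are visited from loudest to quietest. A source is dropped once its
    energy falls more than `margin_db` below the energy accumulated by the
    louder sources already kept. Rule and margin are illustrative assumptions.
    """
    order = sorted(range(len(energies)), key=lambda i: energies[i], reverse=True)
    kept, mix_energy = [], 0.0
    for i in order:
        e = energies[i]
        if e <= 0.0:
            continue  # silent sources are never audible
        if mix_energy > 0.0 and 10.0 * math.log10(e / mix_energy) < -margin_db:
            break  # sources are sorted, so every remaining one is quieter still
        kept.append(i)
        mix_energy += e
    return kept
```

Only the sources returned here would be handed to the clustering stage, which is what keeps the later stages cheap when many sources are present.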

Our Work

Optimized recursive approach to clustering
  Clustering performance evaluation

Improved scalable perceptual premixing
  Quality evaluation study

Study of cross-modal effects by user experiments
  Using results of cross-modal studies to develop audio-visual clustering algorithm

Optimized Recursive Clustering

Recursive splitting of clusters

Fixed-budget approach
  Using a fixed number of clusters

Variable-budget approach
  Splitting clusters until break condition is reached
  Break condition: Average angle error
  Optimal number of clusters

Variant used by EdenGames
  8 cluster budget
  Local clustering when necessary
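
The fixed-budget and variable-budget variants can be sketched with one routine that keeps splitting the cluster with the largest average angle error. This is a minimal illustration under several assumptions: sources are reduced to unit direction vectors as seen from the listener, the representative is the normalized mean direction, and the split heuristic (two most separated members) is chosen for brevity. All function names are hypothetical.

```python
import math

def direction(azimuth_deg):
    """Unit vector in the horizontal plane for a given azimuth (helper)."""
    a = math.radians(azimuth_deg)
    return (math.cos(a), math.sin(a), 0.0)

def angle_deg(u, v):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

def representative(dirs):
    """Normalized mean direction (first member if the mean vanishes)."""
    s = [sum(c) for c in zip(*dirs)]
    n = math.sqrt(sum(c * c for c in s))
    return tuple(c / n for c in s) if n > 1e-12 else dirs[0]

def avg_angle_error(dirs):
    """Average angle between cluster members and their representative."""
    rep = representative(dirs)
    return sum(angle_deg(d, rep) for d in dirs) / len(dirs)

def split(dirs):
    """Split a cluster around its two most separated members (heuristic)."""
    rep = representative(dirs)
    a = max(dirs, key=lambda d: angle_deg(d, rep))
    b = max(dirs, key=lambda d: angle_deg(d, a))
    near_a = [d for d in dirs if angle_deg(d, a) <= angle_deg(d, b)]
    near_b = [d for d in dirs if angle_deg(d, a) > angle_deg(d, b)]
    return near_a, near_b

def recursive_clusters(dirs, max_clusters=8, max_avg_error_deg=0.0):
    """Split the worst cluster until the budget or the break condition is hit.

    With max_avg_error_deg=0.0 this behaves as the fixed-budget variant.
    """
    if not dirs:
        return []
    clusters = [list(dirs)]
    while len(clusters) < max_clusters:
        worst = max(range(len(clusters)),
                    key=lambda i: avg_angle_error(clusters[i]))
        if avg_angle_error(clusters[worst]) <= max_avg_error_deg:
            break  # variable-budget break condition
        near_a, near_b = split(clusters[worst])
        if not near_a or not near_b:
            break  # members are indistinguishable, nothing left to split
        clusters.pop(worst)
        clusters += [near_a, near_b]
    return clusters
```

For example, `recursive_clusters(dirs, max_clusters=8)` mimics a fixed 8-cluster budget, while passing a larger `max_clusters` together with `max_avg_error_deg=10.0` lets the number of clusters adapt to the scene. Applying the break condition to the worst cluster rather than to the scene-wide average is a choice made for this sketch.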

Eden Games’ implementation: Test Drive Unlimited

Clustering Performance Evaluation

Performance of the recursive algorithms is clearly better

Improved progressive scalable perceptual premixing (1)

After clustering: premixing in each cluster
  Why? Effects can be applied afterwards, at lower cost because there are fewer signals
  Only premixing necessary data

Assigning frequency bins to sound sources (iterative importance sampling) by using pinnacle value

Improved progressive scalable perceptual premixing (2)

[Figure: diagram with stages labeled “clustering” and “premixing”]

Improved progressive scalable perceptual premixing (3)

Iterative importance sampling
  Calculation of importance value from energy, loudness or audio saliency map
  Assignment of frequency bins proportional to importance until pinnacle value is reached
  Reassignment of remaining frequency bins to sounds relative to importance values
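
The allocation steps above can be sketched as a small loop. The sketch assumes that the pinnacle value acts as a per-source cap on the number of frequency bins, and that importance values are already computed. The name `allocate_bins` and the rounding policy are assumptions of this example, not the authors' implementation.

```python
def allocate_bins(importance, pinnacle, budget):
    """Share `budget` frequency bins among sources, proportionally to their
    importance, never giving a source more than its pinnacle value."""
    alloc = [0] * len(importance)
    remaining = budget
    # Sources that can still receive bins.
    active = [i for i, w in enumerate(importance) if w > 0 and pinnacle[i] > 0]
    while remaining > 0 and active:
        total = sum(importance[i] for i in active)
        given = 0
        for i in active:
            share = int(remaining * importance[i] / total)  # proportional share
            take = min(share, pinnacle[i] - alloc[i])       # capped by pinnacle
            alloc[i] += take
            given += take
        if given == 0:
            # All shares rounded down to zero: give one bin to the most
            # important source that is still below its pinnacle value.
            i = max(active, key=lambda k: importance[k])
            alloc[i] += 1
            given = 1
        remaining -= given
        # Bins left over from capped sources are reassigned on the next pass.
        active = [i for i in active if alloc[i] < pinnacle[i]]
    return alloc
```

With importances `[6, 3, 1]`, pinnacle values `[4, 10, 10]` and a budget of 20 bins, the first source saturates at 4 bins and its unused share flows to the other two over the following passes.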

Varying budget

Quality Evaluation Study (1)

MUSHRA (“Multiple Stimuli with Hidden Reference and Anchors”) test of perceptual premixing

7 subjects, aged 23 to 40
Ambient, music and speech
Various budgets (2% to 25%)
With and without pinnacle value
Using loudness or saliency as importance value

Quality Evaluation Study (2)

Results:
  Approach is capable of generating high quality using 25% of the original data
  Acceptable results with 10% (2% in case of speech)

Significant effects:
  Budget
  Importance value
  Pinnacle value

Study of Cross-Modal Influences – Questions

Do we need more or fewer clusters in the viewing frustum?
  We move spatial position of sound sources to representative in cluster
  How tolerant are we to this error?

Do visuals influence the perceived quality?

Study of Cross-Modal Influences – Setup (1)

Study of Cross-Modal Influences – Setup (2)

Study of Cross-Modal Influences – Setup (3)

Uniform distribution [1/4]

[2/3] condition

[3/2] condition

[4/1] condition

Study of Cross-Modal Influences – Results

Statistical analysis of the results shows:
  We need more clusters in the viewing frustum
  No significant difference between visuals and no-visuals, but a possible cross-modal effect

Modifying the algorithm

Introducing weighting term in clustering:

Increasing number of clusters in the viewing frustum
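
The slide's weighting term is not reproduced in this transcript. The sketch below only illustrates the general idea under assumed names and values: sources inside the viewing frustum get a larger weight, so a cluster containing them reports a larger error and is split earlier. The field-of-view angles, the `boost` factor and both function names are hypothetical, and azimuth 0 is assumed to be the viewing direction.

```python
def frustum_weight(azimuth_deg, elevation_deg,
                   h_fov_deg=60.0, v_fov_deg=40.0, boost=2.0):
    """Weight of a source: `boost` inside the viewing frustum, 1.0 outside.

    Angles are relative to the viewing direction. All defaults are
    illustrative assumptions.
    """
    inside = (abs(azimuth_deg) <= h_fov_deg / 2.0
              and abs(elevation_deg) <= v_fov_deg / 2.0)
    return boost if inside else 1.0

def weighted_avg_error(errors_deg, weights):
    """Average angle error of a cluster with per-source weights applied."""
    return sum(e * w for e, w in zip(errors_deg, weights)) / len(errors_deg)
```

Used as the split criterion of a recursive clustering loop, such a weighted error steers more of the cluster budget towards the visible part of the scene.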

Cross-Modal illustration

Video: Putting it all together

Conclusions

Up to nearly 3000 sound sources possible in good quality
  Main limitation is graphics (!)

Better quality because more clusters in viewing frustum

Future work
  Experiment with auditory saliency measurements
  Handle procedurally synthesized sounds?

Questions?