TRANSCRIPT
3D Audio Headphones
Woon-Seng Gan, Ee-Leng Tan, Kaushik Sunder, Jianjun He
Main Features
1. Patented structure with strategically positioned emitters;
2. Natural sound reproduction via frontal projection; no HRTF measurement or training required;
3. Produces the important sound cues needed to recreate a realistic perception of moving sound objects immersed in the surrounding ambience;
4. Compatible with all existing sound formats.
Performance
Time-frequency masking [1], linear estimation (PCA, LS) [2], time-shifting [3], etc.
Dr. Gan Woon Seng, [email protected]
Dr. Ee-Leng Tan, [email protected]
School of Electrical and Electronic Engineering, Block S2, S2-B4a-03, 50 Nanyang Avenue, Nanyang Technological University, Singapore 639798.
References
1. W. S. Gan and E. L. Tan, "Listening device and accompanying signal processing method," US Patent 2014/0153765 A1, 2014.
2. K. Sunder, E. L. Tan, and W. S. Gan, "Individualization of binaural synthesis using frontal projection headphones," J. Audio Eng. Soc., vol. 61, no. 12, pp. 989-1000, Dec. 2013.
3. J. He, E. L. Tan, and W. S. Gan, "Linear estimation based primary-ambient extraction for stereo audio signals," IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 22, no. 2, pp. 505-517, Feb. 2014.
4. K. Sunder, E. L. Tan, and W. S. Gan, "On the study of frontal-emitter headphone to improve 3-D audio playback," in Proc. 133rd Audio Engineering Society Convention, San Francisco, Oct. 2012.
5. J. He, E. L. Tan, and W. S. Gan, "Time-shifted principal component analysis based cue extraction for stereo audio signals," in Proc. ICASSP, Vancouver, Canada, 2013, pp. 266-270.
Contact Us
"A headphone with the most natural 3D sound"
Headphone Structure
Metric | Conventional Headphones | 3D Audio Headphones | Lay Explanation
Front-Back Confusion | 70% | 20% | 50% improvement in front-back confusion
Localization Accuracy | 27% (frontal directions), 79.5% (rear directions) | 75% (frontal directions), 77% (rear directions) | 50% improvement in identifying frontal sound objects
Externalization | Negative (inside the head) | Positive (outside the head) | Clear externalization of the sound image
Sound Scene Extraction Algorithms

Metric | Conventional (PCA) | Proposed (Shifted PCA) | Lay Explanation
Extraction Error | -4 dB | -14 dB | 10 dB improvement in detecting the right sound cue
Spatial cue error | ITD: up to 1 ms; ILD: up to 20 dB | ITD: close to 0 ms; ILD: up to 4 dB | Close to human ability to extract time and level cues for localization
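The PCA versus shifted-PCA comparison above can be sketched in a few lines: for each stereo frame, project the two channels onto the dominant inter-channel direction to get the primary (directional) component, with the residual as ambience; the shifted variant first time-aligns the channels at the lag of maximum cross-correlation so an inter-channel delay does not leak primary energy into the ambience. A minimal illustration, not the authors' exact formulation; the lag search range and the circular alignment are assumptions.

```python
import numpy as np

def pae_pca(x_left, x_right):
    """Primary-ambient extraction via PCA on one stereo frame.
    Primary = projection of each channel onto the dominant eigenvector
    of the 2x2 channel covariance; ambience = residual."""
    X = np.vstack([x_left, x_right])        # 2 x N frame
    R = X @ X.T / X.shape[1]                # channel covariance
    _, V = np.linalg.eigh(R)                # eigenvalues ascending
    u = V[:, -1]                            # dominant direction
    primary = np.outer(u, u @ X)            # projection onto u
    ambient = X - primary                   # residual
    return primary, ambient

def shifted_pae_pca(x_left, x_right, max_lag=40):
    """'Shifted PCA' idea: align the channels at the lag maximizing
    cross-correlation, then apply plain PCA-based extraction.
    Uses a circular shift for simplicity (a sketch, not production code)."""
    xc = np.correlate(x_left, x_right, mode="full")
    lags = np.arange(-len(x_right) + 1, len(x_left))
    mask = np.abs(lags) <= max_lag
    lag = lags[mask][np.argmax(xc[mask])]
    x_right_aligned = np.roll(x_right, lag)
    return pae_pca(x_left, x_right_aligned), lag
```

With a purely amplitude-panned source the ambience comes out (numerically) zero, and with a delayed copy the shifted variant recovers the inter-channel lag before extraction.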
Rendering for 3D Audio Headphones
Sound scenes in movies and games often consist of directional sound objects and background ambience, e.g.:
- A circling aircraft in the rain;
- A flying bee at the waterfall;
- A soaring tiger on the sea;
- Attacking wolves in the forest.
(Block diagram) Digital Media → Primary-Ambient Extraction (PAE):
- Directional sound → front emitters → correct sound localization
- Background ambience → all emitters → immersive sound environment
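The routing step of this pipeline can be written out directly: the extracted primary component drives only the front emitters, while the ambience is spread over all emitters. The 4-emitter layout (front L/R plus side L/R) and unity mixing gains here are assumptions for illustration, not the product's actual driver configuration.

```python
import numpy as np

def route_to_emitters(primary, ambient):
    """Map PAE outputs (each a 2 x N stereo array) onto a hypothetical
    4-emitter headphone: primary goes to the front emitters only
    (correct localization), ambience goes to all emitters (immersion)."""
    front = primary + ambient        # front-left, front-right drivers
    side = ambient                   # side-left, side-right drivers
    return np.vstack([front, side])  # 4 x N emitter feeds
```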
1) Human hearing is highly idiosyncratic; the human pinna can be considered an "acoustic fingerprint".
2) Sound waves undergo reflections and diffractions at the pinna cavities on their path to the eardrum.
3) This interaction with the external ear causes unique, direction-dependent spectral patterns at the eardrum.
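This direction-dependent filtering is what binaural rendering reproduces: convolving a mono source with a pair of head-related impulse responses (HRIRs) imprints the interaural time and level differences and the pinna spectral cues for that direction. Below is a toy sketch with made-up impulse responses; real systems use measured, ideally individualized, HRIRs.

```python
import numpy as np

def binauralize(mono, hrir_left, hrir_right):
    """Render a mono source at a virtual direction by convolving it
    with the left/right HRIRs for that direction."""
    return np.convolve(mono, hrir_left), np.convolve(mono, hrir_right)

def ild_db(left, right):
    """Interaural level difference (ILD) between the rendered channels,
    in dB (positive = louder at the left ear)."""
    return 10 * np.log10(np.sum(left**2) / np.sum(right**2))

# Toy HRIR pair for a source off to the listener's left (hypothetical
# numbers): the near (left) ear hears the sound earlier and louder.
fs = 48_000
hrir_l = np.zeros(64); hrir_l[0] = 1.0    # direct path, full level
hrir_r = np.zeros(64); hrir_r[30] = 0.4   # ~0.6 ms later and quieter
```

With these toy responses the rendered pair carries an ITD of 30 samples and an ILD of about 8 dB, the kind of cues the localization results above depend on.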
Main Challenges
Non-individualized HRTFs and non-individualized headphone equalization cause front-back confusions and in-head localization, resulting in imperfect 3D sound.
Binaural Recording
1) To capture the true spatial quality of the sound, recordings are made by placing tiny probe microphones either at the eardrum or at the blocked ear canal.
2) Recordings can be made either on a human subject (individualized) or on a dummy head (non-individualized).
Using non-individualized binaural recordings degrades the veracity of the sound.
Solution: individualization of the binaural recordings using frontal projection.
3D Audio Headphones: Frontal Projection
(Figure: emitter placement relative to the conchae)
Features of frontal projection:
1) Embeds individualized pinna cues automatically during playback;
2) No individualization measurements required;
3) Reduces front-back confusions;
4) Less timbral coloration.
(Figure: equalization curves, frontal EQ vs. diffuse-field EQ)
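Frontal projection still requires headphone equalization: the frontal-emitter-to-eardrum response is compensated (the frontal EQ curve) instead of a conventional diffuse-field target. A generic way to build such an EQ filter is regularized frequency-domain inversion; this is a sketch only, and the measured response `h`, FFT size, and regularization constant are assumptions rather than NTU's actual design.

```python
import numpy as np

def inverse_eq(h, n_fft=512, reg=1e-2):
    """Regularized inverse of a measured emitter-to-eardrum impulse
    response h: EQ(f) = conj(H) / (|H|^2 + reg).  The regularizer
    keeps deep notches in H from turning into excessive boosts."""
    H = np.fft.rfft(h, n_fft)
    return np.conj(H) / (np.abs(H)**2 + reg)
```

Applying the returned frequency response (e.g., via overlap-add filtering) flattens the emitter response within the limits set by the regularization.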