

CROSS-MODAL SOUNDSCAPE MAPPING
symphony.arch.rpi.edu/~braasj/.inside/.Carter2013/J...


QTVR + HARPEX

The present research project linking visual and acoustical documentation of environmental conditions is being conducted in fulfillment of the M.S. in Architectural Sciences at RPI and is due to be completed in the summer of 2013. Architectural Acoustics provides the framework to address the built environment’s impact on acoustic ecology, and this project aims to provide designers with new tools for improved site and context analysis.

THE PROBLEM

Much of this work involves developing an audio-visual mapping methodology, but there is also a case study underway to test the techniques. Just north of Troy, at the confluence of the Mohawk and Hudson Rivers, lies the area’s largest state park: Peebles Island. Although it is secluded and surrounded by several significant sources of aesthetically desirable noise in the form of river rapids and waterfalls, such sounds of ‘nature’ and the tree canopy do very little to mask the profound industrial noise presence of the nearby Mohawk Paper processing plant. How can these complex and difficult acoustical relationships between desired and undesired noise sources be sufficiently conveyed in an otherwise copacetic visual setting?

The QuickTime VR format provides an accessible, portable viewer for delivering the scene with interactive navigation.

High dynamic range image sequences are combined in a spherical panoramic series (360° horizontal × 180° vertical) to record the complete visual field.
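As an illustration of the HDR step (a minimal sketch, not the author’s actual pipeline), the bracketed exposures behind one panorama tile can be merged into a radiance map with OpenCV; the file names and exposure times below are hypothetical:

    # Sketch: merge a bracketed exposure sequence into one HDR tile.
    # File names and exposure times are hypothetical placeholders.
    import cv2
    import numpy as np

    files = ["tile_ev-2.jpg", "tile_ev0.jpg", "tile_ev+2.jpg"]
    times = np.array([1/250, 1/60, 1/15], dtype=np.float32)  # seconds

    images = [cv2.imread(f) for f in files]
    # Recover the camera response curve, then merge to a radiance map
    response = cv2.createCalibrateDebevec().process(images, times)
    hdr = cv2.createMergeDebevec().process(images, times, response)
    cv2.imwrite("tile.hdr", hdr)  # one HDR tile of the spherical panorama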

HDR SPHERICAL PANORAMA + AMBISONIC AUDIO

Field recordings captured with the TetraMic in A-format are converted into basic B-format to provide 1st-order Ambisonic audio.

That B-format signal is further processed using High Angular Resolution Planewave Expansion (HARPEX) to derive 26 directional audio feeds.


The result is a full-range integration of visual and aural fields, in an easily accessed and ‘steerable’ format simple enough to be shared via email.

cardioid capsules, tetrahedral layout

4-channel A-format

W’ (omni)       = FLU + FRD + BLD + BRU
X’ (front/back) = FLU + FRD - BLD - BRU
Y’ (left/right) = FLU - FRD + BLD - BRU
Z’ (up/down)    = FLU - FRD - BLD + BRU

FLU: front left upper · FRD: front right down · BLD: back left down · BRU: back right upper

4-channel B-format

HARPEX process = 26 discrete channels
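The A-format to B-format conversion above is just the four sum/difference operations; a minimal NumPy sketch (ignoring the capsule-matching filters a real converter also applies, and using dummy signals) might look like:

    # Minimal sketch of the A-format -> B-format sum/difference relations.
    import numpy as np

    def a_to_b_format(flu, frd, bld, bru):
        # Combine the four capsule signals per the equations above
        w = flu + frd + bld + bru   # omnidirectional pressure
        x = flu + frd - bld - bru   # front/back figure-of-eight
        y = flu - frd + bld - bru   # left/right figure-of-eight
        z = flu - frd - bld + bru   # up/down figure-of-eight
        return np.stack([w, x, y, z])

    # Usage with dummy capsule signals (one second at 48 kHz)
    rng = np.random.default_rng(0)
    flu, frd, bld, bru = rng.standard_normal((4, 48000))
    b_format = a_to_b_format(flu, frd, bld, bru)  # shape (4, 48000)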

The author holds a BA in Integrated Arts from Bard College, where he studied experimental music composition and non-narrative filmmaking with composer and theorist Benjamin Boretz.

Having trained with master luthier Norman Reed in Devon, UK, he has designed and constructed several guitars, electronic instruments, and loudspeaker systems. He was the first student to complete the dual degree Masters in Architecture and MFA in Architectural Lighting Design at Parsons The New School for Design in NYC. He grew up in South Carolina, spent formative time in Russia and Venice, and has lived most of his adult years in Chicago and along the Hudson River. He enjoys teaching and listening, and believes passive solar design will save us.

ABSTRACT

The Architecture 2030 Challenge maintains that the constructed environment is responsible for nearly half of the world’s energy consumption. Architects therefore have a crucial role to play in addressing climate change. Criteria for sustainable design practice are rapidly evolving, with considerations for ecological impacts becoming increasingly important. The focus tends to be on fossil fuel depletion, but several other areas need attention. Raising awareness about environmental acoustics requires a way of representing field data which is accessible, immersive, and multi-sensory. To facilitate productive discussion among acousticians, ecologists, planners, designers, and citizens, this research project is developing a method of integrating high-dynamic-range spherical panoramic photography with interactive Ambisonic audio information in order to generate interactive maps of environmental conditions.

INTERDISCIPLINARY RESEARCH

Previous research interests have focused on phenomena perilously overlooked by visual representation in design. The author’s position on architectural representation holds that many of the deleterious impacts of the built environment on climate change can be directly linked to blatant overemphasis on ocularcentrism in the discipline. Simply put: an architect tends to neglect those effects and relationships which are not seen and are therefore difficult to draw. Such crucial phenomena as heat gain/loss, darkness adaptation, thermal comfort, noise propagation, weather effects on materials, and the passage of time itself are difficult to convey or discuss using only visual abstractions.

THE METHOD

As criteria for sustainable design practice rapidly evolve, raising awareness about issues of acoustic ecology—which can account for both indoor and outdoor sonic environments—requires a way of representing in situ data that is accessible, immersive, and multisensory. The current state of noise-mapping (see below) relies on one-dimensional metrics and outdated modes of cartographic abstraction (color-coding), and simply does not convey a soundscape’s temporal, spectral, or, most importantly, its contextual complexities.

Multisensory environmental context, after all, largely determines whether a local sound is welcome or not. Unlike noise-mapping, however, mapping the visual environment has evolved significantly in recent years, with the advent of extensive satellite and ‘streetview’ photography supplanting the traditional symbolized abstractions of cartography. A line representing a street is hardly as information-rich as embedding actual photographic material which is easily accessed and navigated by the general public.

How might acoustic ecologists achieve this level of interactive documentation with our diverse sonic environments? Could we not supplement increasingly popular spherical panoramic photography with a layer of ‘steerable’ ambisonic audio, using recent developments in B-format processing to derive higher angular resolution for user navigation?

To meaningfully address the complexities of our dynamic and interwoven sonic environments, it is necessary to acknowledge the vast range of acoustic knowledge (or lack thereof) among stakeholders and policy-makers. In order to facilitate productive discussion among acousticians, ecologists, planners, designers, and citizens, this project is developing such a method to document, seamlessly link, and represent the full-range visual and aural fields. These interactive documents will be simple to access and navigate, and will not resort to metrical abstraction to represent field data.

WYSAHIWYG = What You See And Hear Is What You Get!

CONSTRUCTED vs. ‘NATURAL’

As in other contemporary discussions about urban ecology, where the human-made constructed environment is not treated as somehow distinct from ‘nature’, it is important to recognize that soundscapes both ‘natural’ and ‘unnatural’ are intricately interdependent and suffused with vibration which traverses both. The very nature of sound propagation is such that various sources, even at great distances apart, combine to create the rich tapestries of our soundscapes. Attempting to isolate sounds (or, more accurately, the resulting acoustic effects of sounds propagating within and because of an environment) is highly complicated. Understanding the relationships of various sound sources within context (regardless of their origin) is important to developing more sensitive design approaches to archiving, preserving, conserving, repairing, or constructing meaningful soundscapes.

BEYOND STREET VIEW

Navigation has enjoyed a paradigm shift in recent years, as global information agents such as Google and Bing have leveraged a wealth of both satellite and street-level photography to replace the formerly flat graphics we have been accustomed to for generations. The techniques being developed in this research project could easily be integrated into such extensive ongoing gathering efforts. Imagine if Google’s Street View also included local soundscapes!

Environmental maps incorporating the interactive documents generated in this project will provide the basis for studies which will both document the particularly complicated soundscapes of Peebles Island and allow us to evaluate cross-modal perception models using real context. Ultimately, the goal is to create an effective and convincing advocacy instrument for analyzing and addressing the pervasive noise pollution in the built environment which is threatening a variety of endangered habitats. Truly sustainable design must do more than address energy consumption concerns; it must also address ecological impacts resulting from rapidly deteriorating soundscapes.

[Diagram: first-order B-format directivity patterns, W (omni) plus the X (front/back), Y (left/right), and Z (up/down) figures-of-eight with their + and - lobes]

Proc. of the 2nd International Symposium on Ambisonics and Spherical Acoustics, May 6-7, 2010, Paris, France

[Figure 4: Directional degeneracy; azimuth (deg) vs. angle between sources (deg)]

[Figure 5: Simple decoder signal flow; B-format input, HARPEX direction vectors and complex amplitudes (2 × n), a panning function, and a matrix multiplication producing the n speaker feeds]

4. DECODER IMPLEMENTATION

The most straightforward implementation of a decoder using HARPEX would be to send each of the two direction estimates into a panning function, which would return a weight for each of the output speakers. Each weight would then be multiplied with the complex amplitude of the corresponding planewave to generate speaker feeds. This approach is illustrated in Figure 5, not showing windowing, FFT and IFFT, which would also be necessary.
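A sketch of that signal flow for one analysis frame, assuming the HARPEX direction estimates and complex amplitudes are already available; the cosine-law pan_gains function here is a stand-in, not the paper’s panning function:

    # Sketch of the simple decoder's panning/mixing step for one STFT frame.
    # Direction estimation itself is assumed done elsewhere (HARPEX).
    import numpy as np

    def pan_gains(direction, speaker_dirs):
        # Toy panning: non-negative cosine weights, normalized for constant power
        g = np.maximum(speaker_dirs @ direction, 0.0)
        norm = np.linalg.norm(g)
        if norm == 0.0:
            return np.full(len(speaker_dirs), 1.0 / np.sqrt(len(speaker_dirs)))
        return g / norm

    def simple_decode_frame(directions, amplitudes, speaker_dirs):
        # directions: (bins, 2, 3) unit vectors, the two plane waves per bin
        # amplitudes: (bins, 2) complex plane-wave amplitudes for those waves
        # speaker_dirs: (S, 3) unit vectors toward each loudspeaker
        bins, S = directions.shape[0], len(speaker_dirs)
        feeds = np.zeros((bins, S), dtype=complex)
        for k in range(bins):
            for w in range(2):  # mix each estimated plane wave into the speakers
                feeds[k] += amplitudes[k, w] * pan_gains(directions[k, w], speaker_dirs)
        return feeds  # per-bin speaker spectra; IFFT + overlap-add would follow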

There are several problems with this implementation. Firstly, the HARPEX method does not always return a solution and must be accompanied with a fallback method to use in such cases. Of the known useful fallback methods, none provide more than a single direction estimate.

Secondly, the direction vectors may change rapidly from one frame to the next, causing time domain artifacts related to the frame period. One solution is to smooth the direction vectors, but a better solution is to smooth the resulting panning weights.

Thirdly, the direction vectors may differ significantly from one frequency bin to the next within a frame, causing dispersion and undesirably soft transients. Smoothing the direction vectors across the frequency axis can solve this problem, but again it is better to smooth the panning weights instead.

One effect of smoothing is to introduce leakage between sources that have been separated. For diffuse sources, this leakage is desirable, but for point sources, the amount of smoothing represents a trade-off between the sharpness of localization and the audibility of artifacts.
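A sketch of smoothing the panning weights as recommended above, with illustrative (not prescribed) coefficients: one-pole averaging against the previous frame for the time axis, and a short moving average along the frequency axis:

    # Sketch: smooth the panning-weight matrix rather than the direction vectors.
    import numpy as np

    def smooth_weights(current, previous, alpha=0.7, freq_taps=5):
        # current, previous: (bins, speakers) panning weights for the
        # current and previous STFT frames
        # Temporal smoothing: one-pole average against the previous frame
        w = alpha * previous + (1.0 - alpha) * current
        # Spectral smoothing: moving average along the frequency axis
        kernel = np.ones(freq_taps) / freq_taps
        return np.apply_along_axis(
            lambda col: np.convolve(col, kernel, mode="same"), 0, w)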

Whether the smoothing is done before or after the panning function, the decomposition into two plane waves is no longer valid, since the direction of the waves is altered. To regain a valid decomposition, another two planewaves can be added and the signal must be decomposed into this new basis. In cases where HARPEX returns no solution, three new planewaves must be added, so that the second decomposition always returns four waves.

[Figure 6: Complete decoder; direction estimation on the input signal vector yields input (n) and output (m) mode vectors, followed by matrix inversion, matrix multiplications, and smoothing of the n × m weights to produce the output signal vector]

To ensure good conditioning of the decoding matrix, the additional planewaves should be placed as far away as possible from the original planewaves and each other.

4.1. Panning functions

Since decomposition and resynthesis is split into two separate operations, any panning function can be used. The most obvious choice for horizontal loudspeaker layouts would be a pairwise panning, using a pan law in the 3–6 dB range. This can easily support irregular layouts, and can be extended to with-height layouts using vector-based amplitude panning [6].
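For example, pairwise constant-power (3 dB pan law) panning on a horizontal ring might be sketched as follows; the speaker layout and angles are hypothetical:

    # Sketch: pairwise panning on a horizontal ring with a 3 dB pan law.
    import numpy as np

    def pairwise_pan(azimuth, speaker_az):
        # azimuth in [0, 2*pi); speaker_az: sorted speaker angles in radians.
        # Only the two speakers straddling the source receive signal.
        n = len(speaker_az)
        gains = np.zeros(n)
        i = np.searchsorted(speaker_az, azimuth) % n
        lo, hi = (i - 1) % n, i
        span = (speaker_az[hi] - speaker_az[lo]) % (2 * np.pi)
        frac = ((azimuth - speaker_az[lo]) % (2 * np.pi)) / span
        # 3 dB law: sine/cosine gains keep total power constant across the pair
        gains[lo], gains[hi] = np.cos(frac * np.pi / 2), np.sin(frac * np.pi / 2)
        return gains

    # e.g. a square layout: speakers at 45, 135, 225, 315 degrees
    spk = np.deg2rad([45, 135, 225, 315])
    print(pairwise_pan(np.deg2rad(90), spk))  # equal gains on the front pair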

Other functions worth mentioning are ambisonics-equivalent panning functions [7] and wavefield synthesis [8]. Another interesting option is to use spherical harmonics as panning functions. The output of the decoder will in this case not be loudspeaker feeds, but rather an up-mix of first-order B-format to higher-order B-format. This panning function has the desirable property of reconstructing the sound field in the sweet spot, if combined with a suitable higher-order ambisonic decoder.
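Restricting to the horizontal-only (2-D) case for brevity, re-encoding an estimated plane-wave direction as circular-harmonic gains illustrates the up-mix idea; normalization conventions and the full 3-D spherical harmonics are omitted in this sketch:

    # Sketch: circular (horizontal-only) harmonic gains as a panning function,
    # so decoder output is higher-order components rather than speaker feeds.
    import numpy as np

    def circular_harmonic_gains(azimuth, order):
        # Returns [1, cos(t), sin(t), ..., cos(N*t), sin(N*t)] for angle t
        gains = [1.0]
        for m in range(1, order + 1):
            gains += [np.cos(m * azimuth), np.sin(m * azimuth)]
        return np.asarray(gains)

    # Re-encode one plane wave at 30 degrees into 3rd-order horizontal gains
    print(circular_harmonic_gains(np.deg2rad(30.0), order=3))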

Decoding for binaural playback using head-related transfer functions presents additional challenges stemming from the fact that HRTFs contain phase terms that encode the interaural time delay. These phase terms can lead to audible artifacts unless further processing is undertaken. This will be the subject of a future publication.


CROSS-MODAL SOUNDSCAPE MAPPING
J. PARKMAN CARTER, M.S. STUDENT

Graduate Program in Architectural Acoustics

ADVISOR: JONAS BRAASCH, DIRECTOR, CENTER FOR COGNITION, COMMUNICATION, AND CULTURE

[Aerial map labels: MOHAWK PAPER PLANT · PEEBLES ISLAND STATE PARK]