Gruppo Tematico per la Cinematografia Sonora

Cinema requirements, calibrated listening, dialog.

English version

Simone Corelli, GTCS∗

June 15, 2011

Abstract

These are some notes for my talk at the TC Electronic workshop “Loudness Authority”, held in Rome on 9 and 10 June 2011 at the Discoteca di Stato. The talk was accompanied by some slides created with Apple Keynote (in this PDF I included only selected images). Some information and images are from the book “Elementi di Cinematografia Sonora”. The name of this file is “LoudnessAuthority2011Corelli”. The source was compiled with LaTeX2ε to obtain this PDF.

1 Presentation

Good morning everyone. I’m Simone Corelli, and here I represent, besides myself as a re-recording sound mixer for cinema and television and a member of the EBU PLOUD group, the Gruppo Tematico per la Cinematografia Sonora (Thematic Group for Cinema Sound), created five years ago by some members of the Italian Section of the Audio Engineering Society.

The GTCS brings together professionals who share the wish to increase interest in, and study of, cinema and television sound, both to optimize the workflow and to improve the quality perceived by the audience at the end of the chain. We are also experimenting with surround recording and with the linguistic issues of sound for 3D movies, and we are developing a sound evaluation form for the juries of film awards.

We also want to mention our teaching activity: the publication of a book about the cinema sound workflow and our collaboration with schools and universities.

After today’s and tomorrow’s sessions, the next events organized in collaboration with the GTCS will be the tenth workshop on technologies for music at La Sapienza University, to be held on the twenty-second of this month with an evening concert at the Santa Cecilia Conservatory, and later an organ concert with surround recording, planned for October.

∗ email: [email protected], website: http://www.gtcs.it

Figure 1: Human voice spans nearly 50 dB from a whisper to a scream.

Going back to the main topic of this presentation, it is obviously impossible to speak adequately about re-recording mixing in half an hour. So, given that the main subject of this two-day workshop is “loudness and levels”, we will focus on some points related to them.

2 Cinema alignment

Let’s begin with the question of alignment. As you know, cinema alignment requires a sound pressure level of 85 dB (C-weighted) from each of the three front channels, 3 dB less from each surround channel and 10 dB more from the LFE, using a pink noise stimulus, band-limited to the audio range, whose RMS level equals that of a sine wave peaking at -20 dB FS.

Note that the monitoring system, usually called the B-chain, must exhibit a particular frequency response called the X-curve, which includes some high-frequency attenuation partly due to the contribution of the reverberant field. This is just a reminder that replicating this standard in different acoustical environments is not straightforward.

Surprisingly, using cinema alignment, the sample-peak level of a typical dialog, calm and quiet, with the character in close-up and without audio compression, easily reaches -5 dB FS. The human voice can go much higher, and so can many simple sounds such as clapping.

So it’s clear that at the source we need more headroom, without which clipping is inevitable, especially on the set, where a door closing, plates and cutlery in a kitchen, a few drops of rain on particularly hard surfaces or simply a quarrel scene force the recordist to reduce the microphone gain drastically and therefore to lose the correct proportions between different parts of the movie.

At the end of the chain, at the mixing stage, we obviously use some compression/limiting or other tricks to reduce exaggerated peaks while maintaining the desired loudness or, for the bass, we use the LFE channel which, as we said, guarantees an extra headroom of ten decibels in its own band.

It isn’t a good idea to apply such compression/limiting directly when recording, because only after the scene has been edited will we discover the level needs of every sound source, depending on the chosen shot.

3 GTCS alignment

Since 2008, when we presented the first document about workflow optimization, our group has been suggesting a special alignment to be applied in the production stages, before final mixing, that increases the headroom by some dB. After some experimentation we found that 8 dB more than cinema alignment is a wise, balanced compromise between real needs and the possibility of keeping current devices in use.

This special alignment (we can call it “85 dB SPL at -28 dB FS” or “GTCS alignment”), thanks to the universal adoption of 24 bits, doesn’t degrade the signal-to-quantization-noise ratio very much, because the loss is only slightly more than 1 bit out of the eight gained with the latest technology leap (the end of the DAT era); we must also consider that, according to some studies, the human ear seems to require only 20 bits. Also, most movie sound consists of dialog, whose quality suffers from much more serious bottlenecks, especially in recent years with the widespread use and abuse of radio microphones.
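The "slightly more than 1 bit" figure can be checked with a few lines of arithmetic. The following is only a minimal sketch: it assumes the usual rule of about 6.02 dB per bit and that the DAT-era word length was 16 bits (which is what "eight gained" implies); the function names are illustrative.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB of dynamic range per bit


def headroom_db(reference_dbfs: float) -> float:
    """Headroom between the alignment reference level and digital full scale."""
    return 0.0 - reference_dbfs


def bits_spent(extra_headroom_db: float) -> float:
    """Resolution, in bits, given up by lowering the reference level."""
    return extra_headroom_db / DB_PER_BIT


cinema_ref = -20.0  # dB FS, cinema alignment
gtcs_ref = -28.0    # dB FS, GTCS alignment

extra = headroom_db(gtcs_ref) - headroom_db(cinema_ref)  # 8 dB more headroom
lost = bits_spent(extra)                                 # about 1.33 bits
gained = 24 - 16                                         # assumed: 16-bit DAT to 24-bit files

print(f"extra headroom: {extra:.0f} dB, cost: {lost:.2f} bits out of {gained} gained")
```

Under these assumptions a 24-bit recording aligned at -28 dB FS still keeps more than 22 bits below the old clipping point, which is above the 20 bits mentioned above.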

The main benefits of increasing the headroom are a great reduction in the risk of clipping during recording and the ability to keep a more constant recording gain from the beginning to the end of the shooting, without the need to set it scene by scene, except in very rare cases. At the mixing stage it will therefore be possible to concentrate on adapting the levels to the needs of the scene, instead of worrying about inconsistencies in the gain of the recorded material.

We can successfully apply this alignment to the recording of dialog on the set, at the dubbing stage, in the recording studio with an orchestra, and in the foley studio.

Let’s look more closely at what can happen when mixing a high-dynamic dubbed dialog in which the recordist has modified the recording gain sentence by sentence. First of all, replaying the whole dialog for a check, with all the characters recorded, becomes complex to manage because of the simultaneous presence of different recording-level standards (those who shout were recorded with lower gain and vice versa), with a loss of the right dynamic relationships. Subsequently this material, taken for the final mix, will require operations that greatly slow down the job. To understand this better, it must be said that it’s usually advisable to set the “sends” as pre-fader when treating film dialog, given that the reverberation stays more or less constant as the distance varies, while it is the direct signal that changes.

So it’s clear how simple it is to move a speaker closer or further away, acting on the main fader only, if we can trust that the dialog has been recorded with constant gain. Otherwise we have to add two operations, compensating for the gain variation on the reverberation send and doing the same on the main fader, all by ear.
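The two extra operations can be made explicit with a toy model of a channel with a pre-fader reverb send. This is only an illustrative sketch: all gain values are hypothetical and the function is not part of any real mixing system.

```python
def output_levels(rec_gain_offset_db, fader_db, send_db):
    """Return (direct_db, reverb_db) relative to a take recorded at the reference gain.

    With a pre-fader send, the reverb feed depends on the recording gain and
    on the send level, but not on the main fader.
    """
    direct = rec_gain_offset_db + fader_db
    reverb = rec_gain_offset_db + send_db
    return direct, reverb


# Constant recording gain: moving the speaker away is a single fader move.
near = output_levels(0.0, 0.0, -12.0)
far = output_levels(0.0, -6.0, -12.0)  # direct drops by 6 dB, reverb unchanged

# A shout recorded with the gain lowered by 10 dB (hypothetical value):
# both the fader and the send must be raised by 10 dB just to get back
# to the same starting point, before any creative move.
shout_raw = output_levels(-10.0, 0.0, -12.0)
shout_fixed = output_levels(-10.0, 10.0, -2.0)

print(near, far, shout_raw, shout_fixed)
```

In practice the 10 dB offset is not known to the mixer, which is why both corrections have to be done by ear.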

It’s a serious nuisance if you consider the number of different takes we have to deal with in a typical movie: on average about 700! So it’s a great advantage to use the alignment we suggest, as we have experienced ourselves over the last three years.

3.1 GTCS reference audio file

In 2009 the GTCS produced a test signal useful for empirically calibrating the monitoring system to the proposed -28 dB alignment. It’s not necessary to calibrate very precisely, given that all the produced material will subsequently be mixed, but if you want to, you can use a sound level meter to reach 85 dB SPL (C-weighted) using pink noise at -28 dB FS RMS (= −20 − 8).
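When checking a calibration signal, it helps to be explicit about the level convention: here "dB FS RMS" is referenced so that a sine wave reads the same as its peak level. The following minimal sketch assumes floating-point samples in the range -1 to 1; the function names and the 1 kHz test tone are illustrative and are not taken from the GTCS file.

```python
import math


def rms_dbfs_sine_ref(samples):
    """RMS level in dB FS, referenced so that a full-scale sine reads 0 dB FS."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(2 * mean_square)


def sine(peak_dbfs, freq_hz=1000.0, rate_hz=48000, seconds=1.0):
    """Sine wave with the given peak level; samples are floats in [-1, 1]."""
    peak = 10 ** (peak_dbfs / 20)
    n = int(rate_hz * seconds)
    return [peak * math.sin(2 * math.pi * freq_hz * i / rate_hz) for i in range(n)]


cinema_ref = -20.0
gtcs_ref = cinema_ref - 8.0  # -28 dB FS
level = rms_dbfs_sine_ref(sine(gtcs_ref))
print(round(level, 2))
```

A pink noise file intended for this alignment should read about -28 on the same scale; a meter that references RMS to a full-scale square wave would instead show a value about 3 dB lower.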

The above-mentioned audio file is named GTCSref2009beta1.wav and is available on our website, www.gtcs.it.

1. It starts with an alignment tone of 90 seconds, with a peak of -28 dB FS.

2. Then there is a pop which lasts a single frame at 24 fps. The same happens at the end.

3. Two seconds later a triangle sound introduces and closes a speaker describing how to calibrate the monitoring level using his voice as a reference. At the end there’s a ghost signal recorded 96 dB lower, to check the chain and verify that all 24 bits have passed unharmed. Notice that it isn’t necessary to apply equalization, because the omnidirectional microphone wasn’t affected by the proximity effect.
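The idea behind the ghost signal can be illustrated with a small quantization experiment. This is a sketch under stated assumptions: the voice peak of -20 dB FS is a hypothetical value (the actual levels in the GTCS file are not given here), the ghost signal is modeled as a plain sine, and quantization is done by rounding without dither.

```python
import math


def quantize(sample, bits):
    """Round a float sample in [-1, 1) to a signed integer code, without dither."""
    full_scale = 2 ** (bits - 1)
    code = round(sample * full_scale)
    return max(-full_scale, min(full_scale - 1, code))


def survives(peak_dbfs, bits, freq_hz=1000.0, rate_hz=48000, n=480):
    """True if an undithered sine at peak_dbfs leaves any non-zero code."""
    peak = 10 ** (peak_dbfs / 20)
    return any(
        quantize(peak * math.sin(2 * math.pi * freq_hz * i / rate_hz), bits) != 0
        for i in range(n)
    )


voice_peak_dbfs = -20.0                 # hypothetical level of the spoken reference
ghost_peak_dbfs = voice_peak_dbfs - 96  # "recorded 96 dB lower"

print(survives(ghost_peak_dbfs, 24))  # full 24-bit chain
print(survives(ghost_peak_dbfs, 16))  # chain reduced to 16 bits somewhere
```

With these numbers the ghost signal is still present at 24 bits but rounds to silence at 16 bits, which is the kind of failure the test is meant to reveal. A dithered 16-bit stage would behave differently, burying the signal in noise rather than deleting it.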

Regarding the requirements of loudspeakers, amplifiers and headphones, we have checked some of the most popular devices, like the Aaton Cantar or the Sound Devices 744, and we can state that we haven’t had any problems. In any case we must observe that:

1. For headphone listening it would be advisable to have protection against sudden noises like those caused by the boom operator’s handling, clothes scratching the lavalier microphones, or radio transmission interference. Without these problems the levels are reproduced as if the recordist were listening to reality with his own ears.

2. Most near-field monitors can conform to our alignment; otherwise it’s very simple (at least on Pro Tools systems) to set a limiter on the monitoring output to increase the level and protect against peaks, without affecting the recorded signal.

3. We must admit that it’s much more difficult to adjust the listening level on the video workstations used by scene editors, traditionally equipped with poor loudspeakers. An external audio compressor will help a lot. But it’s even possible to supply a guide track, band-limited and compressed, to help build the movie with a simplified version of the dialog, before the sound editor’s work. This would not be a problem: even now it’s common for the sound editor to reconform the movie using the hi-res source audio, complete with all the channels.


4 My template for mixing movies

Now I would like to spend a few minutes showing you my universal mixing session, based on a Pro Tools ICON system.

As you may know, mixing a movie is a little different from mixing music: considering only the production sound dialog, typically spread over 16 or 24 faders, we have to deal with about a thousand audio fragments requiring hundreds of different treatments in terms of equalization and artificial reverberation. Moving on to backgrounds, special effects and foley we can count over 40 faders, plus up to 24 faders for music tracks. We must add masters, recording tracks, monitoring and encoding/decoding systems, reaching a high degree of complexity, nowadays reduced by things like automation and fader paging.

The Pro Tools template I use is designed so that all the stems that can be recorded, both 2-track and 5.1, are perfectly time-aligned among themselves and also with the electronic or film projector. This has been fine-tuned with the help of a custom-made sync meter.

It is well protected against clipping, thanks to increased headroom at special key points: Pro Tools features 48 dB of extra headroom on busses, but obviously the audio files and converters don’t, and less obviously even plugins clip above 0 dB FS. So, using master faders and limiters, I added twelve dB at critical points.

This session also makes it possible to mix and record simultaneously in 5.1 and in Dolby Surround, and it is designed both for original and for dubbed movies.

Another particular feature is the set of 7 parallel reverberators, independently accessible through two sends and interconnected. This is useful for sequences where people communicate between two places through doors, windows, phones or radios, and it’s a way to rapidly recall presets, which I sorted by the size of the simulated acoustic space.

Regarding equalization, if someone’s interested, this is the standard configuration I start from to treat dialog:

High-pass: usually active by default, it’s set to cut at about 45 Hz with a slope of 18 dB/octave. It’s possible to raise the cut-off frequency if you need to reduce traffic or generator noise. On very high quality sound it’s possible to lower the cut-off frequency.

Bass shelving: set with a frequency of 250 Hz, Q=0.50, useful to modulate the warmth of the voices.

Notch on bass: to reduce hum or chest resonances on male voices picked up with lavalier microphones. A good value to start from is 150 Hz.

Band-pass for boxiness: centered on 450 Hz with a starting point of Q=1.

Band-pass for nasal sound or some boxiness: with a narrow Q=5.5 and frequency set to about 5 kHz. For nasal sounds try lowering the frequency to around 1 kHz or a little more. It must obviously be tuned to target the specific problem. To reduce light whistles it must be set around 10 kHz or more. It’s even useful to reduce sibilants.

High-frequency shelving: useful to brighten an off-axis boom sound or a lavalier covered by heavy clothes. I suggest a starting value of 2.8 kHz, Q=0.5.

Low-pass: useful to reduce garbage sound at high frequencies or to reduce the quality of dubbed voices and make them more similar to the voices recorded on the set. Slope=24 dB/octave, frequency to be evaluated but typically around 13 kHz.

Figure 2: The sound of a movie before mixing (Pro Tools, edit window).

Figure 3: The sync meter made by Marco Montanari: it senses the light changes and modulates a 2000 Hz tone, simultaneously picking up the sound with a mic. Using a pop and recording the two signals, it’s easy to measure the audio-video delay.
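For those who prefer to keep such starting points as a preset, the equalization bands listed above can be written down as plain data. This is only a sketch of one possible representation: the class and field names are hypothetical and do not correspond to any Pro Tools or plugin API, and gains are omitted because they are set by ear on each dialog.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EqBand:
    """One band of the dialog equalizer starting point (illustrative structure)."""
    kind: str
    freq_hz: float
    q: Optional[float] = None
    slope_db_per_oct: Optional[int] = None


# Values as given in the text above; None means "not specified there".
DIALOG_EQ_START = [
    EqBand("high-pass", 45, slope_db_per_oct=18),
    EqBand("bass shelving", 250, q=0.5),
    EqBand("notch on bass", 150),
    EqBand("band-pass (boxiness)", 450, q=1.0),
    EqBand("band-pass (narrow)", 5000, q=5.5),
    EqBand("high-frequency shelving", 2800, q=0.5),
    EqBand("low-pass", 13000, slope_db_per_oct=24),
]

for band in DIALOG_EQ_START:
    print(f"{band.kind:<24} {band.freq_hz:>6.0f} Hz  Q={band.q}  slope={band.slope_db_per_oct}")
```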

Some brief notes about the way I balance the mix: the dialog is absolutely my anchor point, and I tend not to help it against effects and music until the end of the mix. This way I avoid any build-up (an escalation) that would lead to endlessly chasing after all the sound components.

After premixing dialog and effects independently, I listen to the whole and make some small adjustments. When adding the music I take two passes: the first with short-memory reactions and no limits on level changes, pursuing dialog intelligibility and an effective, pleasant and realistic music level; the second using the previous automation lane as a visual aid to anticipate the fader movements and smooth them, so that the listener can’t perceive the technical action.

At the end I check the result with the director, make the required modifications and, finally, improve the dialog intelligibility with small adjustments. A final check after two days helps a lot.

5 Loudness normalization in broadcast

Let’s return to loudness normalization. At the end of 2006 the GTCS embarked on a dialogue with industry experts and representatives of Italian broadcasters, including Alessandro Travaglini from Sky. The document that summarizes this virtual meeting was published in 2008 and is available on our website.

We believe it is interesting to read some of its points, which offer intriguing food for thought on the present revolution and on the road to the future:

1. It’s clear to all that merely enforcing a limit on peaks in broadcast, normally set at 9 dB above the reference level, combined with the use of multiband compressors, valued for making commercials stand out, has led to inconsistencies, losing the correct relationship between different programs and between different TV channels. Also, the abuse of aggressive compression has degraded the overall sound quality and favored lo-fi listening systems, both those integrated in televisions and the cheaper home-theater systems.

2. Loudness normalization will certainly discourage the use of compression and will allow more natural dynamics within each program. However, while providing a valuable aid in most cases, it doesn’t solve the problem of the right proportions between one program and another, especially with shorter or special ones. Imagine a harp solo on one TV channel and a military band on another: listening with a hi-fi system will reveal that the same loudness level isn’t the right choice, especially if there is a speaker introducing the performances: the one presenting the harp concert will sound louder than the one presenting the military band. In this regard the GTCS would suggest including appropriate metadata that lets the viewer select a hi-fi mode preserving the correct proportion between the levels of the various programs, as Dolby Dialnorm does by indicating the correct dialog level.

3. The problem of home listening goes further: are dynamics and bass extension friends or enemies of the ears? It depends on the listening system employed (does it go into saturation?), on the time of day or night, and on the thickness of the walls that separate the listener from the neighbors. We also have to ask ourselves about the source of the high-dynamic sound: is it an unpleasant explosion in a war movie or an exciting symphonic crescendo accompanying our hero on his way to save the princess?

   The GTCS believes that the right choice about dynamic range reduction is to leave control to the end user, so he can enable the full dynamics if he wants to and can. This will foster an education in good sound, will improve the sale of better-quality audio systems and will build up a stock of programs that can be better enjoyed in the future, when it will be easier to have a much higher average sound quality in homes.

4. Although it is economically challenging, the GTCS hopes that (at least for programs of particular quality), after the creation of a high-quality main mix, another mix optimized for listening systems of reduced quality will be considered, made by the same sound mixer and ensuring better results than automatic systems. This will be possible at a cost that today we estimate at no more than 2000 euros per film. Broadcasting the hi-fi and lo-fi versions in parallel would give the best of both worlds. Something similar is happening with some audiophile labels that sell the same album as studio master, CD master and MP3. In this scenario it would be useful to study and officially define a typical home listening system, to be updated every 5 years, giving homogeneity around the world. For those who can rely on my experience, I had the opportunity to produce this double version a few years ago for Paolo Poeti’s very challenging “Pompei”, and I can assure you that this practice brings very noticeable improvements for TV set listeners. I can also confirm that it is a procedure you cannot leave entirely to automation.

Figure 4: A funny infographic about level normalization. Labels in the image: High-Fidelity approach!; Peak normalization (we have an upper limit); Loudness normalization (the upper limit now is much higher but we are imposed to be same weight); Peak normalization with limiting (we apply some tricks like compression); Same height; Same weight, ideal for lo-fi and casual listeners; Real proportions, obtained adding a metadata like Dolby "dialog level" or "mixing level"; Same height with some smart adaptation; The GTCS proposal: adding a way to permit the listener to make his own choice between loudness normalization and true-scale hi-fi approach.

5. It is true that in evaluating the loudness of a program it is good to distinguish the signal in the foreground, narratively significant and indispensable, from those parts that do not contribute to the feeling of volume. This justifies the presence of a gate on the loudness measure integrated over the entire duration of the programme. The threshold for this gate was initially set by the EBU to -8 dB relative to the loudness measured without the gate (the ITU has chosen -10 dB).

   However, we observe that, as in any computation that makes use of a step parameter, the result becomes a little unstable, and it’s easy to demonstrate special cases where, paradoxically, raising the level of the backgrounds above the gate threshold makes the loudness lower instead of higher. In my opinion we should avoid this in the future, replacing the gate with a smoother weighting curve or applying stronger statistics to better identify the perceptual barycenter of audio programs.
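The paradox described in the last point can be reproduced with a deliberately simplified model of relative gating. This is a sketch under stated assumptions, not an implementation of the EBU or ITU algorithms: there is no frequency weighting, no block overlap and no absolute gate, and the per-block loudness values are invented for the example.

```python
import math


def mean_loudness(block_levels_db):
    """Energy average of per-block loudness values, returned in dB."""
    powers = [10 ** (level / 10) for level in block_levels_db]
    return 10 * math.log10(sum(powers) / len(powers))


def gated_loudness(block_levels_db, relative_gate_db=-8.0):
    """The ungated average sets the threshold; blocks below it are then ignored."""
    threshold = mean_loudness(block_levels_db) + relative_gate_db
    kept = [level for level in block_levels_db if level >= threshold]
    return mean_loudness(kept)


foreground = [-20.0] * 50              # hypothetical dialog blocks
quiet_bg = foreground + [-32.0] * 50   # backgrounds fall below the gate
louder_bg = foreground + [-29.0] * 50  # same backgrounds raised by 3 dB

print(round(gated_loudness(quiet_bg), 1))
print(round(gated_loudness(louder_bg), 1))
```

In the first case the backgrounds are gated out and the result is the dialog level alone; in the second they cross the threshold, enter the average and pull the measured loudness down by about 2.5 dB, although the program has objectively become louder. The same effect appears with a -10 dB gate for suitably chosen levels.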

6 Conclusion

I hope you’ve appreciated the effort I’ve made to address you in English, a language which I’ve only been studying for a year. And I do beg you not to ask me direct questions now but to send them to my email address ([email protected]), and I’ll answer you in a few hours.

Thanks to all of you and have a good stay in Rome!
