
www.SandV.com    SOUND & VIBRATION/SEPTEMBER 2008

Correlating Sound Quality Metrics and Jury Ratings

An investigation was conducted into how a large group of sound quality metrics might be used to predict user reactions to sounds from a particular type of product, as expressed in terms of product-specific attributes such as ratings of acceptability and perceived quality of the product. We assume that a jury study has already been conducted for such attributes, producing rating values for various product sounds, and that the objective is to determine which metrics or combinations of metrics can best be used to predict user judgments for the sounds of different versions of the product. The basic methodology employs the use of principal components analysis to group the large number of sound quality metrics into just a few orthogonal (principal) components or factors composed of a weighted sum of the original metrics. A metrics profile is then computed for each sound based on the resulting principal components, followed by the creation of a transformation matrix to convert between mean jury ratings and the metrics profile. The expected performance of this transformation in predicting jury ratings from the metrics profile is then assessed. The procedure is illustrated using an example drawn from yard maintenance equipment.

This article describes an investigation that was conducted into how sound quality (SQ) metrics might be used to predict user reactions to product sounds, where such user reactions are expressed in terms of judgments or ratings on product-specific attributes such as acceptability of the sound, or perceived quality or overall effectiveness of the product itself based on its sound. We assume that at least one jury study has already been conducted on the product class of interest, producing attribute ratings for different versions of the product. The objective then is to determine whether various metrics or weighted combinations of metrics can be used to predict user ratings for the sounds of similar products, avoiding the need to reconvene separate jury studies for each product iteration.

The basic methodology that we have investigated in an attempt to meet this objective involves using principal components analysis (PCA) to group a large number of SQ metrics into just a few orthogonal (principal) components or factors, where such components are composed of a weighted sum of the (standardized) original metrics. A metrics profile (MP) is next computed for each sound based on the first few principal components (PCs), followed by the creation of a transformation matrix to convert between mean jury ratings and the MPs.

Principal Components Analysis

PCA and the related method of common-factor analysis (CFA) are often used to determine if a large number of observed variables can be accounted for in terms of a smaller number of inferred fundamental factors.¹ In our case, PCA was used to transform a large set of metrics into a smaller set of linear combinations of these metrics based on the values of the metrics calculated over a large set of sounds originating from a particular product class. The resulting combinations are the principal components (PCs). This new set of variables accounts for most of the total observed variance, with each combination being orthogonal to the others, meaning that there is no redundant information from one PC to the next. The PCs as a whole form an orthogonal basis for the space of the data, and the first PC is a single axis in this space. When each observation is projected onto this axis, the resulting values form a new variable containing the maximum variance among all possible choices of the first axis. The second PC is another axis in this space, perpendicular to the first, and the variance of this particular variable is the maximum among all possible choices of this second axis, and so on. Usually the first few PCs will account for a large portion of the total variance, and it is this reduction that makes PCA and CFA attractive.
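The geometric picture above (orthogonal axes, with the first axis carrying the maximum variance) can be checked numerically. The following is a minimal sketch in Python with NumPy, not the routine used in the study: the data are synthetic, and the function name `principal_axes` is a hypothetical helper introduced only for illustration.

```python
import numpy as np

def principal_axes(data):
    """Return (eigenvalues, axes) of the covariance of `data` (rows = observations).

    Axes are stored as columns and sorted by decreasing variance.
    """
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(0)
# Synthetic stand-in for "metrics over sounds": 200 observations of 3 correlated variables.
latent = rng.normal(size=(200, 1))
data = np.hstack([latent, 0.8 * latent, -0.5 * latent]) + 0.1 * rng.normal(size=(200, 3))

eigvals, axes = principal_axes(data)
centered = data - data.mean(axis=0)
pc1_variance = np.var(centered @ axes[:, 0], ddof=1)

# Variance along 1000 random unit directions, for comparison with the first PC.
directions = rng.normal(size=(1000, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
random_variances = np.var(centered @ directions.T, axis=0, ddof=1)

print("share of variance per PC:", np.round(eigvals / eigvals.sum(), 3))
print("PC1 variance >= any random direction:", bool(pc1_variance >= random_variances.max()))
```

Because the three synthetic variables share one latent source, nearly all of the variance lands on the first axis, which is the kind of reduction the text describes.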

Example Set of Sound Quality Metrics

A total of 25 different metrics was calculated on a number of product sounds that had been presented to jurors in a previous jury study involving yard maintenance equipment. These metrics are summarized in Table 1. As the characteristics of the product change, we expect some of the metric values to change in a significant way, but not others. Additional metrics could be added to this list, but the ones in Table 1 will be used to illustrate the techniques employed here.

The first 17 metrics in Table 1 are fairly standard ones and were calculated using routines as implemented in a system supplied by LMS. Metrics 18-25 are customized metrics that we have developed and used in the past.² These customized metrics relate to spectral balance (high vs. low frequency content), tonality, and modulation. A brief description of each of these particular metrics is given below.

Metric 18 (spectral rotation) represents the balance of high frequencies relative to low frequencies, with an A-weighted filter spectrum taken as neutral. This is done by determining the degree of pivoting of the A-weighting curve needed for minimizing the difference between the original A-weighted, 1/3-octave band

Table 1. Description of metrics calculated on sounds used in consumer jury study.

Metric  Name / Description                                                      Units
1       Linear SPL                                                              dB, re 20 µPa
2       A-weighted SPL                                                          dB, re 20 µPa
3       B-weighted SPL                                                          dB, re 20 µPa
4       Zwicker loudness (free field)                                           Sones
5       Roughness                                                               Asper
6       Sharpness (free field)                                                  Acum
7       Fluctuation strength                                                    Vacils
8       ANSI speech interference level                                          dB
9       Open articulation index                                                 %
10      Kurtosis                                                                Unitless
11      Pitch unit                                                              Hz
12      Pitch value                                                             Pa
13      Tonality                                                                Ratio
14      Impulse occurrence rate                                                 Hz
15      Impulse duration                                                        msec
16      Impulse peak level                                                      dB, re thres.
17      Impulse rise rate                                                       dB/msec
18      Rotation of A-weighted 1/3-octave band spectrum about 1000 Hz           dB
19      Spectral roughness (avg. deviation from rotated A-weighted spectrum)    dB
20      Low-frequency, slow-modulation index                                    %
21      Low-frequency, fast-modulation index                                    %
22      Mid-frequency, slow-modulation index                                    %
23      Mid-frequency, fast-modulation index                                    %
24      High-frequency, slow-modulation index                                   %
25      High-frequency, fast-modulation index                                   %

David L. Bowen, Acentech Incorporated, Cambridge, Massachusetts

Based on a paper presented at NOISE-CON 08, Institute of Noise Control Engineering, Dearborn, MI, July 2008.



to the rms amplitude of the original envelope obtained from the filtered sound pressure signal. The modulation metrics are then formed by expressing these indices in terms of percent.
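As a rough illustration of the modulation index (the rms of the slow- or fast-filtered Hilbert envelope divided by the rms of the unfiltered envelope, in percent), the sketch below uses Python with NumPy. Several details are assumptions made here rather than taken from the article: the brick-wall FFT band-pass filters, the FFT-based analytic signal, the helper names, and the synthetic test tone.

```python
import numpy as np

def band_limit(x, fs, f_lo, f_hi):
    """Brick-wall FFT band-pass (an assumption; the article does not name its filters)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def hilbert_envelope(x):
    """Amplitude envelope taken as the magnitude of the FFT-built analytic signal."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * gain))

def rms(x):
    return np.sqrt(np.mean(np.square(x)))

def modulation_index(x, fs, carrier_band, modulation_band):
    """Percent ratio of the rms of the band-limited envelope to the rms of the envelope."""
    envelope = hilbert_envelope(band_limit(x, fs, *carrier_band))
    modulated = band_limit(envelope, fs, *modulation_band)
    return 100.0 * rms(modulated) / rms(envelope)

fs = 8000
t = np.arange(2 * fs) / fs
# Test tone: 1 kHz carrier, amplitude-modulated at 4 Hz with 50% depth.
x = (1.0 + 0.5 * np.cos(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)

mid_slow = modulation_index(x, fs, (400.0, 2500.0), (0.5, 8.0))
mid_fast = modulation_index(x, fs, (400.0, 2500.0), (50.0, 90.0))
print(f"mid-frequency slow index: {mid_slow:.1f}%  fast index: {mid_fast:.1f}%")
```

For this tone the envelope is 1 + 0.5 cos(2π·4t), so the slow index works out to (0.5/√2)/√1.125, roughly 33%, while the fast index is essentially zero. The low- and high-frequency ranges would use carrier bands such as (0, 400) and (2500, fs/2) in the same way.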

Generation of PC-Based Metrics Profile

Prior to calculating the principal components weights, the values of the sound quality metrics computed for each sound are first standardized (centered with zero mean and normalized to a standard deviation of 1). This step is needed because the metrics may have completely different units of measure from each other.

Figure 1 shows the relative contributions of each of the principal components calculated from the standardized matrix formed from the 25 metrics described in Table 1 and computed on each of 32 different sounds that had been previously presented to a jury of consumers.
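The standardization step and the two quantities plotted in Figure 1 (eigenvalues and cumulative percentage of variance) might be computed along the following lines. This is a sketch in Python with NumPy under stated assumptions: the 32 x 25 matrix is random placeholder data, not the study's metric values, and the function names are hypothetical.

```python
import numpy as np

def standardize(metrics):
    """Center each column to zero mean and scale it to unit standard deviation."""
    return (metrics - metrics.mean(axis=0)) / metrics.std(axis=0, ddof=1)

def explained_variance(standardized):
    """Eigenvalues of the correlation matrix and cumulative percent of variance."""
    n_obs = standardized.shape[0]
    singular_values = np.linalg.svd(standardized, compute_uv=False)
    eigenvalues = singular_values**2 / (n_obs - 1)
    cumulative_percent = 100.0 * np.cumsum(eigenvalues) / eigenvalues.sum()
    return eigenvalues, cumulative_percent

rng = np.random.default_rng(1)
# Placeholder data only: 32 "sounds" x 25 "metrics" on very different scales.
metrics = rng.normal(size=(32, 25)) * rng.uniform(0.1, 100.0, size=25)

Z = standardize(metrics)
eigenvalues, cumulative_percent = explained_variance(Z)
print("first eigenvalues (scree plot):", np.round(eigenvalues[:5], 2))
print("cumulative percent of variance:", np.round(cumulative_percent[:5], 1))
```

With standardized columns the eigenvalues sum to the number of metrics (25 here), so each eigenvalue can be read as "how many metrics' worth" of variance a PC carries.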

These sounds consisted of variations on the sound of a particular product targeted for yard maintenance, most of which were created by altering the sounds of the different sources and mechanisms within the device (the sounds of four extra existing products in this class were also included in this set). The information in Figure 1 indicates that the first four PCs explain about 85% of the observed variance in the SQ metric values. These four PCs were retained to represent the metrics and were then rotated (using a varimax rotation) and sorted using a modified form of factor analysis to produce the weighted groupings of metrics shown in Table 2.
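The article does not spell out its "modified form of factor analysis", so the following Python/NumPy sketch only illustrates the general idea: a textbook varimax rotation of the retained PC weights, followed by grouping each metric under the PC on which it has its largest absolute weight. The function `varimax`, the sorting rule, the use of absolute values and the random input data are all assumptions made for illustration.

```python
import numpy as np

def varimax(weights, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (metrics x PCs) weight matrix toward the varimax criterion."""
    n_rows, n_cols = weights.shape
    rotation = np.eye(n_cols)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = weights @ rotation
        target = rotated**3 - rotated * (np.sum(rotated**2, axis=0) / n_rows)
        u, s, vt = np.linalg.svd(weights.T @ target)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion < criterion * (1.0 + tol):
            break
        criterion = new_criterion
    return weights @ rotation, rotation

rng = np.random.default_rng(2)
data = rng.normal(size=(32, 25))                      # placeholder metric values
Z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
_, _, vt = np.linalg.svd(Z, full_matrices=False)
unrotated = vt[:4].T                                  # 25 metrics x 4 retained PCs

rotated, rotation = varimax(unrotated)
dominant_pc = np.argmax(np.abs(rotated), axis=1)      # PC with the largest weight per metric
strength = np.abs(rotated).max(axis=1)
order = np.lexsort((-strength, dominant_pc))          # group by PC, strongest metric first
for metric_index in order[:5]:
    print(metric_index + 1, np.round(np.abs(rotated[metric_index]), 3))
```

The rotation is orthogonal, so it redistributes weight among the four retained PCs without changing how much of each metric they jointly represent.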

The PC weightings provide guidance as to what each of the PCs in the reduced set primarily denotes. For example, referring back to the metric descriptions in Table 1, the weightings for each of the PCs in Table 2 appear to group the metrics into what could roughly be translated as:

• Loudness related: metrics such as loudness and the overall SPLs as well as AI and speech interference level.
• Modulation related: metrics such as the mid-frequency slow modulation index, fluctuation strength, etc.
• Tonality related: metrics such as tonality, pitch, and the spectral roughness index.
• Impulsiveness/peakiness related: metrics such as impulse peak level, impulse rise rate, kurtosis, etc.

Using different numbers of PCs will, of course, produce different groupings, and it is often revealing to try out using different

spectrum and an amplitude-shifted and rotated version of the A-weighting curve itself, while matching the overall A-weighted level. A positive rotation (rotation of the A-weighting curve in the counter-clockwise direction) corresponds to a relative increase in the treble end and a reduction in the bass end, while a negative rotation corresponds to the reverse. We have used 1000 Hz as the pivot point for the rotations, and the frequency analysis range includes the 1/3-octave bands from 400 Hz to 2500 Hz. This metric is given in terms of dB per 1/3-octave band.

Metric 19 (spectral roughness) reflects the deviation of the actual 1/3-octave spectrum from a smooth spectrum and is a measure of its spectral irregularity, which is affected primarily by strong tones in the sound. Information for determining this metric arises out of the computations needed for Metric 18 in the form of level differences between the shifted and rotated A-weighting factors for 1/3-octave bands and the original 1/3-octave band spectrum. These differences are then averaged across the frequency bands used in the evaluation to yield an average value for the spectral deviation. A large value can indicate the presence of strong tones that deviate away from the smooth (rotated and shifted) A-weighting curve. This metric is expressed in terms of dB. Note that this metric is different from the traditional time-based roughness described by Metric 5. It also represents another way of estimating the tonality of the sound (other than Metric 13).
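One possible reading of Metrics 18 and 19 is sketched below in Python with NumPy. It is an interpretation, not the authors' code. The A-weighting curve is tilted about 1000 Hz by a trial slope in dB per 1/3-octave band and shifted so its overall level matches the measured spectrum. The slope with the smallest squared error is taken as the rotation. The mean absolute deviation from the fitted curve is taken as the spectral roughness. The grid search, the squared-error criterion, the use of absolute differences and the function names are assumptions; the band corrections are the standard A-weighting values.

```python
import numpy as np

# Standard A-weighting corrections (dB) for the 1/3-octave bands from 400 Hz to 2500 Hz.
BAND_CENTERS_HZ = np.array([400, 500, 630, 800, 1000, 1250, 1600, 2000, 2500])
A_WEIGHTING_DB = np.array([-4.8, -3.2, -1.9, -0.8, 0.0, 0.6, 1.0, 1.2, 1.3])
BAND_STEPS = np.arange(-4, 5)          # 1/3-octave steps away from the 1000 Hz pivot

def overall_level(levels_db):
    """Energy sum of band levels, in dB."""
    return 10.0 * np.log10(np.sum(10.0 ** (levels_db / 10.0)))

def fit_rotated_a_curve(spectrum_db, slopes=np.linspace(-6.0, 6.0, 1201)):
    """Grid-search the rotation (dB per band) of the A-weighting curve that best
    matches `spectrum_db`, shifting each candidate to the same overall level."""
    target_level = overall_level(spectrum_db)
    best = None
    for slope in slopes:
        candidate = A_WEIGHTING_DB + slope * BAND_STEPS
        candidate = candidate + (target_level - overall_level(candidate))
        error = np.sum((spectrum_db - candidate) ** 2)
        if best is None or error < best[0]:
            best = (error, slope, candidate)
    _, rotation, fitted = best
    roughness = np.mean(np.abs(spectrum_db - fitted))   # Metric 19 analogue, in dB
    return rotation, roughness, fitted

# Synthetic A-weighted spectrum: A-curve shifted to 60 dB and tilted by +1.5 dB per band.
spectrum_db = A_WEIGHTING_DB + 60.0 + 1.5 * BAND_STEPS
rotation, roughness, _ = fit_rotated_a_curve(spectrum_db)
print(f"rotation = {rotation:.2f} dB per 1/3-octave band, roughness = {roughness:.3f} dB")
```

A positive slope here raises the bands above 1000 Hz and lowers those below, matching the sign convention in the text. Adding a strong tone to a single band leaves a deviation that no tilt can absorb, which is what drives the roughness value up.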

The six modulation metrics (Metrics 20-25) are designed to represent different types of modulation that may be present in the measured signals. Modulation of sounds is a characteristic that people readily perceive and can be an undesirable acoustic characteristic of machinery sounds. Slow modulation is perceived as variations in amplitude, while fast modulation can be perceived as a fluttering or buzz-like characteristic. To distinguish between these different types of modulations, the Hilbert transform is used to compute the amplitude envelopes of the signals, which are then filtered between 0.5 and 8 Hz to detect slow modulation, and between 50 and 90 Hz to detect fast modulation. In addition, since product sounds can be quite complicated, these fast and slow modulations are evaluated within three different frequency ranges: below 400 Hz, between 400 and 2500 Hz, and above 2500 Hz. A modulation index is then formed by taking the ratio of the rms amplitude of the slow or fast envelope signal

Figure 1. Judging importance of the principal components obtained from a set of 25 sound quality metrics calculated for 32 different sounds: (a) scree plot of the eigenvalues, (b) cumulative percentage of observed variance represented by the PCs.

[Figure 1 panels: (a) Eigenvalue vs. PC Number; (b) Percent vs. PC Number.]

Table 2. Rotated and sorted PC weighting factors for the first four principal components computed from a set of 25 SQ metrics calculated for 32 sounds.

Metric Number   PC1     PC2     PC3     PC4
3               0.402   0.003   0.014   0.123
4               0.392   0.03    0.003   0.004
1               0.362   0.012   0.057   0.223
2               0.354   0.001   0.013   0.086
8               0.33    0.027   0.029   0.118
9               0.323   0.045   0.004   0.123
5               0.255   0.215   0.136   0.165
22              0.123   0.498   0.057   0.118
7               0.14    0.369   0.019   0.087
18              0.061   0.359   0.166   0.274
20              0.2     0.27    0.037   0.188
21              0.127   0.247   0.085   0.014
11              0.182   0.216   0.072   0.124
13              0.059   0.113   0.478   0.079
12              0.096   0.14    0.476   0.075
15              0.088   0.267   0.419   0.091
23              0.063   0.15    0.39    0.01
19              0.032   0.3226  0.329   0.087
17              0.029   0.083   0.013   0.359
16              0.009   0.008   0.06    0.347
25              0.081   0.071   0.043   0.32
14              0.023   0.068   0.077   0.3
24              0.082   0.06    0.055   0.295
10              0.069   0.0-44  0.025   0.291
6               0.051   0.045   0.164   0.288



of predicting user reactions to the sounds of other products in this same general class using the same set of weighting factors given by Table 2 to compute the MP scores for these new sounds. Alternatively, PCA could be applied again to form a new set of weighting factors based on the SQ metrics values computed for the new set of sounds. However, this latter approach would not be recommended unless the number of new sounds (the number of observations for the PCA) was comparable to or greater than the number of sounds used to form the original PC weighting factors like those in Table 2.
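Reusing a stored set of weighting factors for new sounds could look like the sketch below (Python with NumPy). Several choices are assumptions and not stated in the article: new sounds are standardized with the mean and standard deviation of the reference set, the weights are left unrotated for brevity, a single common upward shift is used, and all data are random placeholders. The names `fit_profile_model` and `predict_ratings` are hypothetical.

```python
import numpy as np

def fit_profile_model(metrics, ratings, n_pcs):
    """Fit standardization, PC weights, upward shift and transformation on a reference set."""
    mean, std = metrics.mean(axis=0), metrics.std(axis=0, ddof=1)
    Z = (metrics - mean) / std
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    weights = vt[:n_pcs].T                          # unrotated weights, for brevity
    scores = Z @ weights
    shift = max(0.0, 1.0 - scores.min())            # assumed offset: smallest score becomes 1
    X = np.linalg.pinv(scores + shift) @ ratings    # least-squares transformation
    return {"mean": mean, "std": std, "weights": weights, "shift": shift, "X": X}

def predict_ratings(model, new_metrics):
    """Score new sounds with the stored weights; no new PCA is run."""
    Z = (new_metrics - model["mean"]) / model["std"]
    profile = Z @ model["weights"] + model["shift"]
    return profile @ model["X"]

rng = np.random.default_rng(5)
reference_metrics = rng.normal(size=(32, 25))       # placeholder data
reference_ratings = rng.uniform(10, 90, size=32)
model = fit_profile_model(reference_metrics, reference_ratings, n_pcs=4)

new_metrics = rng.normal(size=(6, 25))              # six hypothetical new product sounds
print(np.round(predict_ratings(model, new_metrics), 1))
```

Keeping the reference statistics fixed means a handful of new sounds can be scored without re-estimating the PCA, which is the situation the text recommends when few new observations are available.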

Conclusions

An approach has been described that attempts to establish a link between a set of objective sound quality metrics and subjective impressions of product sounds. The method makes use of principal components analysis to first reduce a large number of metrics into a weighted combination of smaller groups of metrics. To do this, PCA is applied to a large set of metric values calculated on a large set of sounds, all of which are presumed to originate from a general type of product class (vacuum cleaners or front-loading washing machines or lawn tractors, for example). The first few PCs are then used to develop a set of weighting factors that are applied to the metric values to obtain a (reduced dimension) PC-based metrics profile.

A transformation matrix between the resulting MP scores for these sounds and a set of corresponding jury ratings on particular attributes for these same sounds can then be calculated and evaluated in terms of its ability to predict the jury ratings. A satisfactory transformation can therefore allow physical measurements of sounds from different products or product versions within a product class (made as changes are made to the product) to reasonably predict the effect of these changes on perceived SQ without the need to conduct repeated jury studies.

Applying the technique to a set of 25 metrics calculated on the sounds from 32 different variations of a particular type of yard maintenance equipment resulted in an R² goodness-of-fit value of 0.47 when used to predict attribute-rating values obtained from a consumer jury that was exposed to these same sounds. The next step would be to assess the accuracy of the ratings predicted if this same transformation were then applied to a new set of sounds obtained from this same product class.

Future directions in this area include investigating the possible use of alternate statistical techniques that are somewhat related to principal components analysis, such as the regression techniques of principal components regression and partial least-squares (PLS) regression. This latter technique may offer a more direct and possibly more robust way to generate a reduced-order model for predicting SQ ratings from metric values than the PCA-based metric-profiles approach described here.

PLS can be thought of as a cross between multiple linear regression and PCA, but unlike PCA, PLS directly considers the observed response values (the jury ratings in our case), finding combinations of predictor PCs that have large covariance (a measure of the degree to which two variables change together) with response values.⁴ In general, PLS is more of a predictive technique compared to the more interpretive technique of PCA. We hope to report on the results of this and our other on-going work in these areas in the near future.
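To make the contrast with PCA concrete, here is a minimal sketch of single-response PLS (often called PLS1) using the NIPALS recursion, written in Python with NumPy. It is not part of the study and is offered only as an illustration. The function name `pls1_fit` and the synthetic data are assumptions. Each weight vector is chosen along the direction of maximum covariance between the (deflated) predictors and the response.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 regression coefficients via the NIPALS recursion on centered data."""
    X_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - X_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                      # direction of maximum covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                         # latent score for this component
        tt = t @ t
        p = Xk.T @ t / tt                  # predictor loading
        c = yk @ t / tt                    # response loading
        Xk = Xk - np.outer(t, p)           # deflate predictors
        yk = yk - c * t                    # deflate response
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - X_mean @ coef
    return coef, intercept

rng = np.random.default_rng(4)
X = rng.normal(size=(40, 5))               # placeholder "metric values"
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=40)

coef_2, intercept_2 = pls1_fit(X, y, n_components=2)
coef_full, intercept_full = pls1_fit(X, y, n_components=5)
ols = np.linalg.lstsq(np.column_stack([np.ones(40), X]), y, rcond=None)[0]
print("2-component PLS coefficients:", np.round(coef_2, 2))
print("full PLS matches ordinary least squares:", bool(np.allclose(coef_full, ols[1:], atol=1e-6)))
```

With as many components as predictors, PLS1 reproduces ordinary least squares; the reduced-order model comes from stopping after fewer components.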

References

1. Dillon, W. R., and Goldstein, M., Multivariate Analysis: Methods and Applications, John Wiley & Sons, New York, 1984.
2. Bowen, D. L., and Lyon, R. H., Mapping Perceptual Attributes of Sound to Product Design Choices, Noise Control Engineering Journal, Vol. 51, No. 4, pp. 271-279, 2003.
3. Lawson, C., and Hanson, R., Solving Least Squares Problems, Prentice-Hall, Englewood Cliffs, New Jersey, 1974.
4. Rosipal, R., and Kramer, N., Overview and Recent Advances in Partial Least Squares, in Subspace, Latent Structure and Feature Selection: Statistical and Optimization Perspectives Workshop (SLSFS 2005), Revised Selected Papers (Lecture Notes in Computer Science 3940), Springer-Verlag, pp. 34-51, 2006.

numbers of PCs, while keeping in mind the guiding information such as that provided by Figure 1.

The weightings for the reduced set of PCs are then used to calculate the corresponding scores for each of the sounds under consideration using the model for each PC given in Eq. 1,¹ where:

PCn = resulting score for the nth PC
Y = (standardized) metric values for each of the p metrics
w = weights on these variables

For our example, the resulting scores would then consist of four values (from the four PCs) for each of the 32 sounds. We refer to these values as a (PC-based) metrics profile (MP), one MP for each sound. The MPs are then all shifted upward so that all are greater than zero for ease of interpretation.
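In matrix form, the weighted sum for every PC and every sound is a single product of the standardized metric matrix and the weight matrix. The sketch below (Python with NumPy) uses a tiny made-up example. The function name, the numbers, and the size of the upward shift (chosen so the smallest score becomes 1) are assumptions, since the article only says the profiles are shifted to be greater than zero.

```python
import numpy as np

def metrics_profiles(standardized_metrics, weights):
    """Weighted-sum scores for all sounds and PCs at once, followed by an upward shift.

    standardized_metrics: (N sounds x p metrics); weights: (p metrics x Q PCs).
    """
    scores = standardized_metrics @ weights          # PC_n = sum over i of w_(n)i * Y_i
    shift = max(0.0, 1.0 - scores.min())             # assumed offset: smallest score becomes 1
    return scores + shift

# Toy example: 3 sounds, p = 3 standardized metrics, Q = 2 retained PCs.
Y = np.array([[ 1.0,  0.0, -1.0],
              [ 0.0,  1.0,  1.0],
              [-1.0, -1.0,  0.0]])
W = np.array([[0.5,  0.1],
              [0.5, -0.2],
              [0.1,  0.9]])

profiles = metrics_profiles(Y, W)
print(profiles)
# [[2.4 1.2]
#  [2.6 2.7]
#  [1.  2.1]]
```

For the first sound, the first score before shifting is 0.5·1 + 0.5·0 + 0.1·(−1) = 0.4; the smallest score in the set is −1.0, so every value is raised by 2.0.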

Transformation from Metrics Profile to Jury Ratings

The MP scores, along with the mean values of normalized jury ratings previously obtained for these same sounds, were then used to calculate a linear transformation matrix X between the MPs and the jury ratings. This was done by solving for X in the general system of equations described by Eq. 2, where:

A = N × Q matrix of metrics profile scores for the sounds (N = number of sounds, Q = number of principal components retained)
B = N × M matrix of mean jury ratings on M attributes for these same sounds
X = desired Q × M transformation matrix

Since N (the number of sounds) will generally be greater than Q (the number of PCs retained), Eq. 2 becomes an over-determined system of equations (32 equations in four unknowns for our example). The transformation matrix (or vector if jury ratings are for a single attribute only) X can then be determined in a least-squared error sense using, for example, singular-value decomposition to generate a pseudo-inverse of the A matrix.³
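A least-squares solution of the over-determined system through an SVD-based pseudo-inverse can be sketched as follows in Python with NumPy. The matrices are random placeholders with the same shapes as the example (32 sounds, four PCs, one attribute). The function name `solve_transformation` and the singular-value cutoff are assumptions for illustration.

```python
import numpy as np

def solve_transformation(A, B, rcond=1e-10):
    """Least-squares X for the over-determined system A X = B via an SVD pseudo-inverse."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > rcond * s.max()                 # drop negligible singular values
    s_inv = np.zeros_like(s)
    s_inv[keep] = 1.0 / s[keep]
    A_pinv = (Vt.T * s_inv) @ U.T              # pseudo-inverse: V diag(1/s) U^T
    return A_pinv @ B

rng = np.random.default_rng(3)
A = rng.uniform(0.5, 5.0, size=(32, 4))        # placeholder metrics profiles (N x Q)
X_true = np.array([4.0, -2.0, 7.0, 1.5])
B = A @ X_true + rng.normal(scale=2.0, size=32)   # placeholder mean ratings, one attribute

X = solve_transformation(A, B)
X_reference = np.linalg.lstsq(A, B, rcond=None)[0]
print("transformation vector:", np.round(X, 3))
print("agrees with numpy.linalg.lstsq:", bool(np.allclose(X, X_reference)))
```

The same function works unchanged when B has several columns (one per attribute), in which case X is a Q × M matrix.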

Figure 2 summarizes how well the resulting transformation vector in this example was able to predict the jury ratings for the attribute perceived power of the product using the MP values derived from the first four principal components. In this case, the R² goodness-of-fit indicator was about 47%. The four furthest outliers in Figure 2 are associated with the four sounds included in the jury study that were not created by altering the sounds of various components in the baseline unit (these extra sounds were the sounds of competitor units, alternate models, etc.). If we do not include these four outliers, the resulting R² value increases to about 88%.

This same transformation could now be evaluated in terms

The author can be reached at: [email protected].

$$\mathrm{PC}_n = w_{(n)1} Y_1 + w_{(n)2} Y_2 + \cdots + w_{(n)p} Y_p \qquad (1)$$

$$A X = B \qquad (2)$$

Figure 2. Jury ratings on the attribute perceived power (on a normalized scale of 0-100) for the sounds of 32 different versions of a piece of yard maintenance equipment, vs. ratings predicted by multiplying an SQ metrics profile derived from these same sounds by a transformation vector.

[Figure 2 scatter plot: x-axis, Original Consumer Ratings of Power (means, taken over 22 jurors and 3 runs); y-axis, Projected Ratings Using 4 PCs for Metrics Profile; both axes run from 10 to 90; R² = 0.47.]
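The effect of a few outlying sounds on a goodness-of-fit value like the one shown in Figure 2 can be illustrated with a short Python/NumPy sketch. The article does not say exactly how R² was computed, so the definition below (1 minus residual over total sum of squares) is an assumption. The ratings and the positions of the four outliers are invented for the example.

```python
import numpy as np

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_residual = np.sum((actual - predicted) ** 2)
    ss_total = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_residual / ss_total

# Invented data: 32 ratings, mostly well predicted, with 4 sounds far off the trend.
actual = np.linspace(20.0, 80.0, 32)
predicted = actual + 3.0 * np.sin(np.arange(32))
outliers = np.array([3, 11, 19, 27])
predicted[outliers] += np.array([25.0, -30.0, 28.0, -26.0])

inliers = np.setdiff1d(np.arange(32), outliers)
r2_all = r_squared(actual, predicted)
r2_without_outliers = r_squared(actual[inliers], predicted[inliers])
print(f"R^2 with all 32 sounds: {r2_all:.2f}")
print(f"R^2 without the 4 outliers: {r2_without_outliers:.2f}")
```

Because the squared residuals of the outliers dominate the residual sum, removing them raises R² sharply, which mirrors the jump from about 47% to about 88% reported for the study data.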
