philip harrison j p french associates & department of language & linguistic science, york...

Philip Harrison

J P French Associates &

Department of Language & Linguistic Science,

York University

IAFPA 2006 Annual Conference

Göteborg, Sweden

Variability of Formant Measurements – Part 2

2

Summary

• Briefly recap previous analysis & last year’s presentation

• New analysis & results

• PhD research

• Questions

3

Study

• Aim: Investigate the variability of formant measurements which exists both within and between different software programs currently used in the field of forensic phonetics.

– 3 programs – Praat, Multispeech & Wavesurfer

– 3 analysis parameters – LPC order, analysis (frame/window) width, pre-emphasis

– Word list – 5 vowel categories – 6 tokens per category – read 3 times – total = 90 tokens

– 2 speakers – Peter French & me

– 2 simultaneous recordings – microphone & telephone

4

Results & Analysis

• Scripts used to obtain 37,260 individual formant measurements using LPC formant trackers

• Analysis – microphone data only

– Initial observations of raw formant data

– Quantitative analysis of results

– Statistical analysis

5

My F1s from Praat – LPC Variation

[Plot: F1 frequency (Hz), 0 to 3500, against token number (1 to 90), one series per LPC order (6, 8, 10, 12, 14, 16, 18); tokens grouped by vowel category: FLEECE, TRAP, PALM, GOOSE, SCHWA]

6

The Plot Shows…

• Scripts work – (used in fault finding)

• Vowel categories clear

• Greatest deviation – LPC orders 6 & 8

• Orders 10 to 18 very similar for FLEECE, GOOSE & SCHWA

• Generated many more plots for all formants, parameters & software

– Lots of variation

– Difficult to interpret

7

Quantitative Analysis

• Quantitative Difference Analysis

– No absolute measurement to compare formants with – outcome of analysis, not directly comparable with acoustic reality

– Difference calculated between value obtained with default analysis settings and value obtained with varied settings

– Absolute difference calculated for each formant then averaged by vowel category

– Shows variation between two analyses
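The difference measure described on this slide could be sketched roughly as follows. This is only an illustration, not the scripts used in the study: the function name, the dictionary layout and all numeric values are invented for the example.

```python
from collections import defaultdict
from statistics import mean

def mean_abs_diff_by_vowel(default, varied, vowel_of):
    """Average |varied - default| per vowel category.

    default, varied: dict mapping token id -> formant value in Hz
    vowel_of:        dict mapping token id -> vowel category label
    (All three structures are hypothetical, chosen for this sketch.)
    """
    diffs = defaultdict(list)
    for token, ref in default.items():
        diffs[vowel_of[token]].append(abs(varied[token] - ref))
    return {vowel: mean(values) for vowel, values in diffs.items()}

# Invented F1 values (Hz), purely for illustration
vowel_of = {1: "FLEECE", 2: "FLEECE", 3: "TRAP", 4: "TRAP"}
f1_default = {1: 280.0, 2: 300.0, 3: 760.0, 4: 740.0}
f1_varied = {1: 290.0, 2: 270.0, 3: 700.0, 4: 760.0}

result = mean_abs_diff_by_vowel(f1_default, f1_varied, vowel_of)
print(result)
```

Taking the absolute value before averaging stops positive and negative shifts from cancelling, so the figure reflects how far the two analyses disagree rather than in which direction.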

8

Observations

• Numerical analysis confirmed impression from plots

• Clear differences between vowel categories, speakers, formants, software & settings

• Complex set of results with no clear patterns

9

Statistical Analysis

• Paired t-test between measurements from default settings and varied settings for each vowel category

– Null hypothesis – altering analysis settings has no effect

– Experimental hypothesis – altering analysis settings has an effect

• Number of significant ‘hits’ summed – max 15

• Higher number = greater variation in formant measurements

• 2 significance levels – 0.01 & 0.05
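A minimal sketch of the "hit" counting, assuming one paired t-test per cell (for example, per vowel category and formant) and a two-sided test. The p-value is obtained by numerically integrating the Student t density so that only the standard library is needed. The function names, the cell layout and the data are invented for illustration; the statistics package actually used in the study is not stated in the slides.

```python
import math
from statistics import mean, stdev

def paired_t(default, varied):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [v - d for d, v in zip(default, varied)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson integration of the t density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps                      # steps must be even
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = 2 * total * h / 3             # probability mass in [-|t|, |t|]
    return max(0.0, 1.0 - central)

def count_hits(cells, alpha):
    """Count cells whose paired t-test is significant at level alpha."""
    hits = 0
    for default, varied in cells.values():
        t, df = paired_t(default, varied)
        if two_sided_p(t, df) < alpha:
            hits += 1
    return hits

# Invented measurements (Hz): (default settings, varied settings) per cell
cells = {
    ("FLEECE", "F1"): ([300, 310, 295, 305], [340, 352, 330, 349]),
    ("TRAP", "F1"): ([750, 760, 745, 755], [751, 758, 747, 754]),
}
for alpha in (0.05, 0.01):
    print(alpha, count_hits(cells, alpha))
```

A cell in which the varied setting shifts every token the same way produces a hit, while a cell with small shifts in both directions does not.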

10

Conclusions

• Hoped to have clear patterns, able to produce set of guidelines/recommendations

• Patterns only at specific, detailed level

• Very clear that many factors affect formant measurements

• No software is obviously better than others

• Care should be taken when measuring formants

11

New Work!!!

• Initial data contained obviously incorrect measurements

• Discard measurements – criterion?

• Determine acceptable band

– Spectrograms – no

– Formant bandwidths – no (attempted)

– LPC tracker & spectrogram – no (attempted)

– Spectrum of selection – yes but still encountered problems

• Band limit 300 Hz – impressionistic

12

Spectrum Measurements

• Used to determine centre of 300 Hz acceptable band

• Spectrum with 260 Hz bandwidth – same as default spectrogram

• Measured peaks F1, F2 & F3

• Issues/problems

– Windowed -> biased to centre of selection

– Formant peaks not always clear – some tokens ignored

– Double peaks – highest peak measured

13

Analysis of Accepted Measurements

• Analyse LPC variation only – other parameters more stable – not altered

• No accurate reference which raw measurements can be judged against

• Accepted results provide indication of accuracy & consistency

• Clear patterns in accepted formants

• Condense results – % accepted per vowel category
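The acceptance step and the percentage summary could be sketched as below. One assumption is made loudly here: the 300 Hz band is read as a total width centred on the spectrum peak, so a measurement is accepted if it lies within 150 Hz of the peak. The slides do not spell this out, and the function names and sample values are invented.

```python
from collections import defaultdict

BAND_WIDTH_HZ = 300.0  # from the slides; treated as total width (assumption)

def is_accepted(lpc_hz, spectrum_peak_hz, band_width_hz=BAND_WIDTH_HZ):
    """True if the LPC value lies inside the band centred on the peak."""
    return abs(lpc_hz - spectrum_peak_hz) <= band_width_hz / 2

def percent_accepted(measurements):
    """measurements: iterable of (vowel, lpc_hz, spectrum_peak_hz) tuples.

    Returns a dict mapping vowel category -> percentage accepted.
    """
    counts = defaultdict(lambda: [0, 0])  # vowel -> [accepted, total]
    for vowel, lpc_hz, peak_hz in measurements:
        counts[vowel][1] += 1
        if is_accepted(lpc_hz, peak_hz):
            counts[vowel][0] += 1
    return {v: 100.0 * acc / tot for v, (acc, tot) in counts.items()}

# Invented F1 data (Hz): LPC tracker value vs spectrum peak for one LPC order
sample = [
    ("FLEECE", 270.0, 260.0),
    ("FLEECE", 900.0, 260.0),   # tracker jumped to a higher peak: rejected
    ("TRAP", 780.0, 770.0),
    ("TRAP", 700.0, 770.0),
]
summary = percent_accepted(sample)
print(summary)
```

Running this once per LPC order would give the points for one line of an accepted-results plot.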

14

Plot of Accepted Results

[Plot: Praat, Me, Mic, F1 – percentage accepted (0 to 100) against LPC order (6, 8, 10, 12, 14, 16, 18), one line per vowel category: FLEECE, TRAP, PALM, GOOSE, SCHWA]

15

Me Microphone Accepted

[Figure: 3 × 3 grid of plots of percentage accepted against LPC order. Columns: Praat, Multispeech, Wavesurfer; rows: F1, F2, F3. LPC orders 6 to 18 in steps of 2 for Praat and Multispeech, 10 to 18 in steps of 1 for Wavesurfer.]

16

Me Telephone Accepted

[Figure: 3 × 3 grid of plots of percentage accepted against LPC order. Columns: Praat, Multispeech, Wavesurfer; rows: F1, F2, F3. LPC orders 6 to 18 in steps of 2 for Praat and Multispeech, 10 to 18 in steps of 1 for Wavesurfer.]

17

JPF Microphone Accepted

[Figure: 3 × 3 grid of plots of percentage accepted against LPC order. Columns: Praat, Multispeech, Wavesurfer; rows: F1, F2, F3. LPC orders 6 to 18 in steps of 2 for Praat and Multispeech, 10 to 18 in steps of 1 for Wavesurfer.]

18

JPF Telephone Accepted

[Figure: 3 × 3 grid of plots of percentage accepted against LPC order. Columns: Praat, Multispeech, Wavesurfer; rows: F1, F2, F3. LPC orders 6 to 18 in steps of 2 for Praat and Multispeech, 10 to 18 in steps of 1 for Wavesurfer.]

19

General Patterns

• Praat & Multispeech – bell curves

– Most consistent setting – Praat 10, Multispeech 10 to 14

– Curves shifted to left (lower LPC) for phone

• Wavesurfer – horizontal

– Different behaviour to Praat & Multispeech

– Some very weak results – especially F3

– For me better results for phone recording (also true for Praat & Multispeech)

• Most consistent setting Praat LPC 10

• Again variation across vowel category, speaker, formant, software & condition

20

Microphone vs Telephone

• Künzel (2001):

– Landline phone vs microphone

– Largest F1 difference in region of 14% for close vowels

• Byrne & Foulkes (2004):

– GSM mobile phone vs microphone

– F1 average 29% higher for GSM

• No large differences for F2 & F3

• Current data (spectral comparisons) – only 2 speakers

21

Comparison Tables

Me

Vowel   F1 (Hz)  F1 % Diff  F2 (Hz)  F2 % Diff  F3 (Hz)  F3 % Diff
FLEECE      258         26     2171          0     2891          0
TRAP        771          0     1394          1     2632         -1
PALM        690          6     1125         -1     2626         -2
GOOSE       260         33     1748          0     2242          0
SCHWA       502          0     1486          1     2513         -1

JPF

Vowel   F1 (Hz)  F1 % Diff  F2 (Hz)  F2 % Diff  F3 (Hz)  F3 % Diff
FLEECE      254         13     2140          0     2551          0
TRAP        661          2     1413         -1     2306          0
PALM        607          6     1037         -1     2439          0
GOOSE       269         11     1105         -1     2222          0
SCHWA       528          1     1330          0     2274          0
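A percentage difference of the kind shown in the % Diff columns could be computed as in the sketch below. The slides do not give the formula, so both the direction (telephone relative to microphone) and the rounding to whole percent are assumptions, and the input values are invented rather than taken from the tables.

```python
def percent_diff(mic_hz, phone_hz):
    """Percentage by which the telephone value differs from the microphone value.

    Assumption: the microphone measurement is the baseline.
    """
    return 100.0 * (phone_hz - mic_hz) / mic_hz

# Invented spectral peak values (Hz) for one vowel category
mic_f1, phone_f1 = 250.0, 300.0
mic_f2, phone_f2 = 2000.0, 1990.0

f1_diff = round(percent_diff(mic_f1, phone_f1))
f2_diff = round(percent_diff(mic_f2, phone_f2))
print(f1_diff, f2_diff)
```

With these invented numbers the low F1 shifts upwards by a large proportion while F2 barely moves, which is the general shape of the pattern reported for telephone speech.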

22

General Observations

• LPC tracks for phone recordings more stable, easier to measure

– Less ‘information’ above F3

– Possibly pre-filter recordings?

• Different LPC orders produce better tracks for different formants of the same token

– Contradicts my previous advice to keep LPC setting constant across vowel categories

23

PhD Next Steps

• Use synthesised speech

• Formant values specified

• Repeat software experiments

• Other factors to investigate

– Pitch

– Voice quality

– Interaction of analysis parameters
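To illustrate what synthesised speech with specified formant values might look like, here is a minimal sketch. It passes an impulse train through a cascade of second-order digital resonators, the building block commonly used in cascade formant synthesisers. The synthesiser intended for the PhD work is not named in the slides, so this technique, the function names, and every parameter value (sample rate, pitch, formant frequencies and bandwidths) are assumptions made for the example.

```python
import cmath
import math

def resonator_coeffs(freq_hz, bw_hz, fs_hz):
    """Coefficients of y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / fs_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalises the gain at 0 Hz to 1
    return a, b, c

def resonate(signal, freq_hz, bw_hz, fs_hz):
    """Filter a list of samples through one formant resonator."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs_hz)
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesise_vowel(formants, fs_hz=10000, f0_hz=100, dur_s=0.2):
    """formants: list of (frequency_hz, bandwidth_hz) pairs."""
    n = int(fs_hz * dur_s)
    period = int(fs_hz / f0_hz)
    signal = [1.0 if i % period == 0 else 0.0 for i in range(n)]  # impulse train
    for freq_hz, bw_hz in formants:
        signal = resonate(signal, freq_hz, bw_hz, fs_hz)
    return signal

def gain_at(probe_hz, freq_hz, bw_hz, fs_hz):
    """Magnitude response of one resonator at probe_hz."""
    a, b, c = resonator_coeffs(freq_hz, bw_hz, fs_hz)
    z_inv = cmath.exp(-2j * math.pi * probe_hz / fs_hz)
    return abs(a / (1 - b * z_inv - c * z_inv * z_inv))

# Invented, schwa-like formant specification: (frequency Hz, bandwidth Hz)
vowel = synthesise_vowel([(500, 60), (1500, 90), (2500, 120)])
print(len(vowel))
```

Because the formant frequencies are inputs to the synthesis, measurements from each program can be compared against known values rather than only against each other.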

24

Other Potential Areas of Investigation for PhD

• Effects of GSM coding & transmission

• Acoustic environments

• Pseudo-formants – source???

• Mouth/telephone distance & orientation

• Any other ideas…?

25

Questions

?

Thanks to Peter French & Paul Foulkes
