27/08/03 VOQUAL ‘03 1
Variation of glottal LF parameters across F0, vowels
and phonetic environment
Michelle Tooher & John McKenna
School of Computing, Dublin City University
Context
• Machine-learn the characteristics of a speaker
– Given utterance information (prosodic, contextual
and utterance-specific)
– Predict the LF parameters of the glottal flow for
that speaker and utterance
Data
• 2 male speakers
• 3 vowels - /a/ /i/ /u/
• 4 contexts - /s_t/ /s_d/ /z_t/ /z_d/
• “Say __ again”
• 7 target pitches (F0): 90 to 210 Hz
• Randomly presented
• 3 sets
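The recording design above can be enumerated as a quick sanity check on the amount of data. This is only a sketch under assumptions that the slide does not state: the seven pitch targets are taken to be equally spaced (20 Hz apart), each speaker is assumed to record every set in full, and the names `SPEAKERS`, `build_prompts` and the phoneme-string carrier format are hypothetical.

```python
import itertools
import random

SPEAKERS = ["S1", "S2"]                      # 2 male speakers
VOWELS = ["a", "i", "u"]                     # 3 vowels
CONTEXTS = ["s_t", "s_d", "z_t", "z_d"]      # 4 consonant frames
# Assumption: 7 equally spaced targets, 20 Hz apart, from 90 to 210 Hz.
PITCHES_HZ = list(range(90, 211, 20))
SETS = [1, 2, 3]                             # 3 sets


def build_prompts(seed=0):
    """Return one shuffled prompt list per (speaker, set) pair."""
    rng = random.Random(seed)
    sessions = {}
    for speaker, set_no in itertools.product(SPEAKERS, SETS):
        items = [
            {"vowel": v, "context": c, "f0_hz": f0,
             "carrier": f"Say /{c.replace('_', v)}/ again"}
            for v, c, f0 in itertools.product(VOWELS, CONTEXTS, PITCHES_HZ)
        ]
        rng.shuffle(items)                   # random presentation order
        sessions[(speaker, set_no)] = items
    return sessions


sessions = build_prompts()
total = sum(len(items) for items in sessions.values())
print(len(PITCHES_HZ), "pitches,", total, "tokens")
```

Under these assumptions each set holds 3 x 4 x 7 = 84 prompts, giving 504 tokens over both speakers and all three sets.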
Analysis & Fitting
• Kalman-filter-based LP (McKenna, ‘99)
– Chooses closed-phase sections
– Performs closed-phase covariance LP
– Output: differentiated glottal flow (DGF)
• LF fitting (Fant et al., ‘85)
– LF parameters: tp, te, tc, Ta
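To make the closed-phase covariance LP step concrete, here is a minimal sketch of covariance-method linear prediction followed by inverse filtering. It is not the Kalman-filter-based method of McKenna (‘99): the closed-phase window is assumed to be given already, and the helper names (`solve_linear`, `covariance_lp`, `inverse_filter`) are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        acc = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = acc / M[i][i]
    return x


def covariance_lp(s, order):
    """Minimise sum over n >= order of (s[n] - sum_k a[k] * s[n-k])^2."""
    idx = range(order, len(s))
    phi = [[sum(s[n - i] * s[n - j] for n in idx)
            for j in range(1, order + 1)]
           for i in range(1, order + 1)]
    c = [sum(s[n] * s[n - i] for n in idx) for i in range(1, order + 1)]
    return solve_linear(phi, c)


def inverse_filter(s, a):
    """Prediction residual; a[k] is the coefficient for lag k + 1."""
    p = len(a)
    return [s[n] - sum(a[k] * s[n - 1 - k] for k in range(min(p, n)))
            for n in range(len(s))]


# Synthetic check: a freely decaying 2nd-order resonance (no excitation),
# standing in for a closed-phase segment.
signal = [1.0, 1.5]
for _ in range(38):
    signal.append(1.5 * signal[-1] - 0.7 * signal[-2])
coeffs = covariance_lp(signal, order=2)
residual = inverse_filter(signal, coeffs)
print([round(c, 6) for c in coeffs])
```

On this synthetic signal the estimated coefficients should come out close to 1.5 and -0.7, and the residual after the first two samples should be near zero. On real speech, applying the estimated filter over whole pitch periods is what would yield an estimate of the DGF.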
LF model
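As a reminder of what the four timing parameters control, the following is a minimal sketch of one period of the LF waveform in its commonly cited two-segment form: an exponentially growing sinusoid up to te, then an exponential return phase up to tc. It is an illustration, not the fitting procedure used in this work. The function name `lf_pulse`, the default tc = T0, the unit excitation strength `ee` and the sampling rate are assumptions.

```python
import math


def lf_pulse(t0, tp, te, ta, tc=None, ee=1.0, fs=100_000):
    """Sample one period of an LF-style differentiated glottal flow.

    Times are in seconds. Assumes tp < te < 2 * tp and ta < tc - te.
    tc defaults to t0 (closure at the end of the period).
    """
    tc = t0 if tc is None else tc
    wg = math.pi / tp                  # open-phase sinusoid frequency
    d = tc - te                        # return-phase duration

    # epsilon from eps * ta = 1 - exp(-eps * d), by fixed-point iteration.
    eps = 1.0 / ta
    for _ in range(100):
        eps = (1.0 - math.exp(-eps * d)) / ta

    # Area under the return phase (negative).
    area_ret = -(ee / (eps * ta)) * (
        (1.0 - math.exp(-eps * d)) / eps - d * math.exp(-eps * d))
    s, c = math.sin(wg * te), math.cos(wg * te)

    def area_open(alpha):
        # Closed-form open-phase integral with E(te) = -ee enforced.
        num = alpha * s - wg * c + wg * math.exp(-alpha * te)
        return -ee * num / ((alpha * alpha + wg * wg) * s)

    # alpha by bisection so that the net flow over the period is zero.
    lo, hi = -1.0e4, 1.0e5
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if area_open(mid) + area_ret > 0.0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)

    out = []
    for i in range(int(round(t0 * fs))):
        t = i / fs
        if t <= te:                    # open phase
            out.append(-ee * math.exp(alpha * (t - te))
                       * math.sin(wg * t) / s)
        elif t <= tc:                  # return phase
            out.append(-(ee / (eps * ta))
                       * (math.exp(-eps * (t - te)) - math.exp(-eps * d)))
        else:                          # closed phase
            out.append(0.0)
    return out


# Hypothetical parameter values for a 125 Hz period.
pulse = lf_pulse(t0=0.008, tp=0.0036, te=0.0048, ta=0.0003)
```

The negative peak of the returned waveform sits at te with amplitude -ee, and the samples sum to approximately zero, which is the usual area-balance condition on the model.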
Questions
• Does glottal flow vary w.r.t. utterance and
speaker?
• Any distinct patterns?
• What influences these variations/patterns?
• Should they be taken into consideration?
• Speaker specific?
Data Analysis
• LF parameters from beginning, middle,
end of each vowel
• Statistical analysis (SPSS)
• Data plots
Data Analysis
• SPSS – correlation analysis*
* Pearson correlation coefficients

Speaker 1
        T0      tp      te      tc      Ta
T0      -       .757    .902    .964    .446
tp      -       -       .949    .859   -.065
te      -       -       -       .961    .132
tc      -       -       -       -       .364
Ta      -       -       -       -       -

Speaker 2
        T0      tp      te      tc      Ta
T0      -       .812    .929    .962    .500
tp      -       -       .936    .892    .127
te      -       -       -       .973    .267
tc      -       -       -       -       .452
Ta      -       -       -       -       -
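The coefficients were produced with SPSS. For readers without it, a pairwise Pearson correlation over the same kind of columns can be sketched as below. The data values are made up for illustration and are not the measurements behind the table. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_table(columns):
    """Upper-triangle correlations, keyed by (row name, column name)."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b])
            for i, a in enumerate(names) for b in names[i + 1:]}


# Made-up values in seconds, for illustration only.
t0 = [0.005, 0.006, 0.008, 0.010, 0.011]
columns = {
    "T0": t0,
    "tc": [0.9 * t for t in t0],                      # exactly linear in T0
    "Ta": [0.0004, 0.0002, 0.0005, 0.0003, 0.0004],   # unrelated scatter
}
table = correlation_table(columns)
for pair, r in table.items():
    print(pair, round(r, 3))
```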
Results
• Variations w.r.t. T0:
– tp, te and tc rise with T0
• te and tc close to linear, whereas tp shows
nonlinearity
– Ta: little variation
Results
[Figure: plot of TC against T0, SPEAKER: .00. T0 axis .004 to .012; TC axis .003 to .011.]
Results (cont.)
• Variations w.r.t. vowels:
– LF parameter values for /a/ are
higher than for /i/ and /u/
– Linear regression shows
significant differences in
both slope and y-intercept
between /a/ and /i/ or /u/
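The per-vowel comparison of slope and y-intercept can be illustrated with an ordinary least-squares line fitted separately for each vowel. This is a sketch only: the numbers are made up rather than taken from the study, `fit_line` is a hypothetical helper, and the significance testing reported on the slide (done in SPSS) is not reproduced.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Made-up te values (seconds) against T0 for two vowels.
t0 = [0.005, 0.006, 0.007, 0.009, 0.011]
te_by_vowel = {
    "a": [0.60 * t + 0.0005 for t in t0],
    "i": [0.50 * t + 0.0002 for t in t0],
}
fits = {vowel: fit_line(t0, te) for vowel, te in te_by_vowel.items()}
for vowel, (slope, intercept) in fits.items():
    print(vowel, round(slope, 3), round(intercept, 5))
```

Comparing the two fitted lines shows the kind of difference the slide describes: a higher level (intercept) and a different rate of change (slope) for /a/ in this made-up data.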
Results (cont.)
• Variations w.r.t. environment:
– Both linear regression and data plots show the
following:
• /z/ preceding: affects parameters
• /s/ preceding: no apparent effect
• Context following the vowel has no effect on
parameters
• Voiceless and voiced pairs /s_t/, /z_d/: no effect
Results (cont.)
• Variations in waveshape parameters
– […] and […] change with […]
– […] varies little
– As […] rises […] also rises
• Variations w.r.t. speaker
– Same patterns across two speakers
– S2 values are generally higher than S1
[Symbols on this slide, placement lost in extraction: T0, Rd, Ra, Rk, Rg, T0]
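The slide does not define the waveshape parameters. The sketch below uses the definitions commonly given for the LF model (Rg, Rk and Ra as ratios of the timing parameters, and Rd via Fant's approximation formula). Whether the authors used exactly these forms is an assumption, and the input values are made up.

```python
def waveshape_parameters(t0, tp, te, ta):
    """Common LF waveshape ratios from the timing parameters (seconds)."""
    rg = t0 / (2.0 * tp)          # glottal frequency relative to F0
    rk = (te - tp) / tp           # pulse skew
    ra = ta / t0                  # relative return-phase time constant
    # Approximation relating Rd to the other three ratios.
    rd = (1.0 / 0.11) * (0.5 + 1.2 * rk) * (rk / (4.0 * rg) + ra)
    return {"Rg": rg, "Rk": rk, "Ra": ra, "Rd": rd}


# Hypothetical parameter values for a 125 Hz period.
shape = waveshape_parameters(t0=0.008, tp=0.0036, te=0.0048, ta=0.0003)
print({name: round(value, 4) for name, value in shape.items()})
```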
Conclusions
• Patterns do exist
• LF parameters vary with T0
• te and tc appear to vary linearly with T0, whereas tp and Ta
appear to vary non-linearly
• Only the voiced context appears to have an effect on
the parameters, and only when it precedes the
vowel
Conclusions
• Vowel influences values of parameters
• Variations with T0 are not speaker specific; however, the
values of the LF parameters are
• More data across speakers, contexts and vowels is
needed for a more exhaustive study
Question
• How could this affect synthesis?
– F0 manipulation – parameters need to be adjusted
but possibly at different rates
– Original and target environments
– New speakers – can a new speaker be created by
varying the levels (y-intercept) of the parameters?
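One way to picture the first and last points is a per-parameter linear model in T0: changing F0 moves each parameter along its own line, and shifting a line's level gives a different "speaker". The sketch below is purely illustrative. The slopes and intercepts are hypothetical, not fitted to the study data, and only te and tc are included because the slides report those as close to linear.

```python
def predict_timing(f0_hz, fits, level_shift=None):
    """Evaluate slope * T0 + intercept (+ optional level shift) per parameter."""
    t0 = 1.0 / f0_hz
    shift = level_shift or {}
    return {name: slope * t0 + intercept + shift.get(name, 0.0)
            for name, (slope, intercept) in fits.items()}


# Hypothetical (slope, intercept) pairs; each parameter has its own rate.
HYPOTHETICAL_FITS = {"te": (0.55, 0.0004), "tc": (0.90, 0.0002)}

original = predict_timing(100.0, HYPOTHETICAL_FITS)
raised = predict_timing(150.0, HYPOTHETICAL_FITS)       # F0 manipulation
new_voice = predict_timing(100.0, HYPOTHETICAL_FITS,
                           level_shift={"te": 0.0003})  # shifted level
print(original, raised, new_voice)
```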
• Additional data plots follow
[Figure: plot of TP against T0, SPEAKER: .00. T0 axis .004 to .012; TP axis .001 to .006.]
[Figure: plot of TE against T0, SPEAKER: .00. T0 axis .004 to .012; TE axis .002 to .010.]
[Figure: plot of TC against T0, SPEAKER: .00. T0 axis .004 to .012; TC axis .003 to .011.]
[Figure: plot of TA against T0, SPEAKER: 1.00. T0 axis .004 to .012; TA axis 0.000 to .003.]
[Figure: plot of RG against T0, SPEAKER: 1.00. T0 axis .004 to .012; RG axis 0 to 6.]
[Figure: plot of RD against T0, SPEAKER: 1.00. T0 axis .004 to .012; RD axis 0.0 to 3.0.]