Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder
![Page 1: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/1.jpg)
Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder
Stephen Jannetts and Felix Schaeffler
11th Pan-European Voice Conference, 31/08/2015
![Page 2: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/2.jpg)
Voice in connected speech
Requires
- Initiation of phonation
- Maintenance of phonation
- Termination of phonation

… in quick succession and at specific points in time
![Page 3: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/3.jpg)
Voice in connected speech
- Initiation of phonation
- Maintenance of phonation
- Termination of phonation

Voice Problems
![Page 4: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/4.jpg)
Voice in connected speech
- Initiation of phonation
- Maintenance of phonation
- Termination of phonation

(Gordon & Ladefoged, 2001)
Voice Problems
![Page 5: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/5.jpg)
Clinical acoustic assessment
- Focused on phonation maintenance
- Uses sustained vowels to exclude confounds
- Initial and final portions of the vowel are excluded

→ Phonation initiation and termination are not taken into account
![Page 6: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/6.jpg)
Clinical acoustic assessment
- This approach has been criticised for poor validity (e.g. Takahashi & Koike, 1976; Hammarberg et al., 1980; Askenfelt & Hammarberg, 1986; Maryn et al., 2010; Maryn & Roy, 2012; Choi et al., 2012)
- Complex transitions in connected speech could be a rich source of clinical information
- Mechanical consequences of inflammation or tension could be most evident at voice onset
- Initiation and termination are rarely differentiated, even when connected speech is used (see e.g. Vocal Rise Time, Baken & Orlikoff, 2000, p. 129)
![Page 7: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/7.jpg)
Phonation Stabilisation Time
- Acoustic approach to phonation initiation
- Uses connected speech
- Does not require manual segmentation
![Page 8: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/8.jpg)
PST based on autocorrelation
Schaeffler et al., 2015: http://www.icphs2015.info/pdfs/Papers/ICPHS0331.pdf
[Figure: autocorrelation coefficient over time (s). Labels: onset of voicing; voicing threshold (.45); stable periodicity threshold (.91).]
![Page 9: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/9.jpg)
PST based on CPP
[Figure: Cepstral Peak Prominence over time (s). Labels: onset of voicing; voicing threshold (.45); stable periodicity threshold (23.14 dB).]
![Page 10: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/10.jpg)
PST is a duration
PST
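The definition on the preceding slides can be read as a threshold-crossing rule: a voiced segment starts when the voicing threshold is reached, and PST is the time from that onset until the CPP contour first reaches the stable periodicity threshold. The following is a minimal sketch of that reading, not the authors' implementation. Only the two threshold values (.45 and 23.14 dB) come from the slides. The function names, the frame-wise input contours, the `frame_step_ms` parameter, and the choice to take the first frame at or above the stable threshold are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

VOICING_THRESHOLD = 0.45   # autocorrelation coefficient (value shown on the slides)
STABLE_CPP_DB = 23.14      # CPP in dB (value shown on the slides)


def voiced_segments(autocorr: Sequence[float],
                    threshold: float = VOICING_THRESHOLD) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, of runs at or above threshold."""
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, value in enumerate(autocorr):
        if value >= threshold:
            if start is None:
                start = i                      # assumed onset of voicing
        elif start is not None:
            segments.append((start, i))
            start = None
    if start is not None:                      # contour ends while still voiced
        segments.append((start, len(autocorr)))
    return segments


def pst_per_segment(autocorr: Sequence[float],
                    cpp_db: Sequence[float],
                    frame_step_ms: float,
                    stable_threshold_db: float = STABLE_CPP_DB) -> List[Optional[float]]:
    """PST in ms for each voiced segment; None if the segment never reaches the threshold."""
    if len(autocorr) != len(cpp_db):
        raise ValueError("contours must have the same number of frames")
    results: List[Optional[float]] = []
    for start, end in voiced_segments(autocorr):
        pst: Optional[float] = None
        for i in range(start, end):
            if cpp_db[i] >= stable_threshold_db:
                pst = (i - start) * frame_step_ms   # duration from onset to stability
                break
        results.append(pst)
    return results


if __name__ == "__main__":
    # Toy contours for illustration only, not data from the study.
    autocorr = [0.1, 0.5, 0.6, 0.7, 0.2, 0.5, 0.5]
    cpp_db = [5.0, 10.0, 20.0, 24.0, 3.0, 10.0, 12.0]
    print(pst_per_segment(autocorr, cpp_db, frame_step_ms=10.0))  # [20.0, None]
```

Returning `None` for segments that never reach the threshold keeps those segments countable, which matters for a "percentage of segments that reached criterion" style of measure.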
![Page 11: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/11.jpg)
PST based on CPP – keeping things robust
[Figure: Cepstral Peak Prominence over time (s). Labels: onset of voicing; voicing threshold (.45); stable periodicity threshold (23.14 dB); 70 ms.]
![Page 12: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/12.jpg)
PST - Research Questions
- Q1: Can PST differentiate normal and disordered voices?
- Q2: Can PST detect cases that are below pathological thresholds for sustained vowels?
![Page 13: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/13.jpg)
Material
KayPENTAX Disordered Voice Database
- Sustained vowel: stable portion of sustained [a], including 22 MDVP parameters (shimmer, jitter, etc.)
- Connected speech: 12 s section of the ‘Rainbow Passage’
- Voices are categorised as ‘normal’ and ‘pathological’
![Page 14: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/14.jpg)
All samples

| | Normal | Disordered | Total |
|---|---|---|---|
| Female | 31 | 191 | 222 |
| Male | 21 | 121 | 142 |
| Total | 52 | 312 | 364 |
![Page 15: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/15.jpg)
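Samples below threshold

| | Normal | Disordered | Total |
|---|---|---|---|
| Female | 30 (31) | 20 (191) | 50 |
| Male | 15 (21) | 17 (121) | 32 |
| Total | 45 | 37 | 82 |

Values in parentheses are the corresponding counts for all samples.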
![Page 16: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/16.jpg)
Procedure
PST calculated using CPP
Variables
- PST mean per sample (PST M)
- PST standard deviation per sample (PST SD)
- Percentage of voiced segments that reached criterion (Seg%)
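The three variables listed above can be derived from a list of per-segment PST values. The sketch below assumes that segments which never reached the criterion are represented as `None`, that PST M and PST SD are computed only over segments that did reach it, and that the sample standard deviation is meant. None of these choices is stated on the slides, and the function name `summarise_sample` is hypothetical.

```python
import statistics
from typing import Dict, Optional, Sequence


def summarise_sample(psts: Sequence[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarise one speech sample from its per-segment PST values.

    `None` entries stand for voiced segments that never reached the criterion.
    """
    reached = [p for p in psts if p is not None]
    return {
        # Mean PST over segments that reached the criterion (assumed definition of PST M).
        "pst_mean": statistics.mean(reached) if reached else None,
        # Sample SD needs at least two values (assumed definition of PST SD).
        "pst_sd": statistics.stdev(reached) if len(reached) >= 2 else None,
        # Share of all voiced segments that reached the criterion (Seg%).
        "seg_percent": 100.0 * len(reached) / len(psts) if psts else None,
    }


if __name__ == "__main__":
    # Toy values for illustration only, not data from the study.
    print(summarise_sample([20.0, None, 40.0, 60.0]))
```

With the toy input, three of four segments reach the criterion, so Seg% is 75 and the mean and SD are taken over the three remaining values.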
![Page 17: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/17.jpg)
Results: all voices
Female disordered voices:
- significantly longer duration PST M (U=728.5, p<0.001)
- significantly larger SD of PST (U=502, p<0.001)
- Seg% significantly lower (U=557, p<0.001)

Male disordered voices:
- significantly longer duration PST M (U=221.5, p<0.001)
- significantly larger SD of PST (U=200, p<0.001)
- Seg% significantly lower (U=140, p<0.001)
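The slides report U statistics without naming the test. By convention a U statistic comes from a Mann-Whitney U (rank-sum) comparison of two independent groups, and the sketch below assumes that reading. It shows how U is obtained from pooled ranks, with tied values given average ranks. The function names and the toy numbers are illustrative and are not taken from the study, and no p-value is computed here.

```python
from typing import List, Sequence, Tuple


def midranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U_x, U_y); the two always sum to len(x) * len(y)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one value")
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    u_y = n1 * n2 - u_x
    return u_x, u_y


if __name__ == "__main__":
    # Toy groups for illustration only, not data from the study.
    normal = [45.0, 50.0, 52.0]
    disordered = [60.0, 75.0, 90.0]
    print(mann_whitney_u(normal, disordered))  # (0.0, 9.0)
```

Which of the two U values a report quotes (often the smaller one) is a convention that varies between statistics packages.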
![Page 18: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/18.jpg)
Results: below threshold

| | Normal | Disordered |
|---|---|---|
| PST M | 50.84 (11.39)* | 70.28 (27.15)* |
| PST SD | 23.49 (7.89)** | 52.13 (28.45)** |
| Seg% | 93.57 (6.31)* | 79.02 (23.68)* |

Means (SD) for female ‘below threshold’

| | Normal | Disordered |
|---|---|---|
| PST M | 45.88 (9.52)* | 77.86 (33.03)* |
| PST SD | 26.94 (9.99)* | 49.31 (25.15)* |
| Seg% | 95.58 (4.35)** | 76.63 (22.05)** |

Means (SD) for male ‘below threshold’

\* p < 0.005; \*\* p < 0.001
![Page 19: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/19.jpg)
[Figure: PST SD (y-axis, 0 to 120) plotted against mean PST (x-axis, 0 to 180) for disordered and normal voices. Panel: male below threshold.]
![Page 20: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/20.jpg)
![Page 21: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/21.jpg)
[Figure: PST SD (y-axis, 0 to 120) plotted against mean PST (x-axis, 20 to 140) for disordered and normal voices. Panel: female below threshold.]
![Page 22: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/22.jpg)
![Page 23: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/23.jpg)
Hypothesis confirmed
- PST was significantly longer in all disordered voice groups
- PST is a potentially useful parameter for the analysis of disordered voices
- Even for voices without pathological findings in sustained vowels
- Maybe particularly relevant for mild/early-stage voice disorders? Or a certain type?
![Page 24: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/24.jpg)
Remaining questions and work to be done
- Categorisation and diagnostic labelling: is PST more useful for a specific voice disorder or symptom?
- Segmental context?
- Algorithm tweaking and streamlining the process
![Page 25: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/25.jpg)
Thank you!
![Page 26: Cepstral Peak Prominence-Based Phonation Stabilisation Time as an Indicator of Voice Disorder](https://reader035.vdocuments.net/reader035/viewer/2022070515/587a258f1a28abbd388b4fcf/html5/thumbnails/26.jpg)
References
- Gordon, M. & Ladefoged, P. (2001). Phonation types: a cross-linguistic overview. Journal of Phonetics, 29(4), 383–406.
- Maryn, Y., Roy, N., De Bodt, M., Van Cauwenberge, P. & Corthals, P. (2009). Acoustic measurement of overall voice quality: a meta-analysis. The Journal of the Acoustical Society of America, 126(5), 2619–34.
- Askenfelt, A.G. & Hammarberg, B. (1986). Speech waveform perturbation analysis: a perceptual-acoustical comparison of seven measures. Journal of Speech and Hearing Research, 29(1), 50–64.
- Choi, S.H. et al. (2012). The effect of segment selection on acoustic analysis. Journal of Voice, 26(1), 1–7.
- Hammarberg, B. et al. (1980). Perceptual and acoustic correlates of abnormal voice qualities. Acta Oto-Laryngologica, 90(5-6), 441–51.
- Maryn, Y. & Roy, N. (2012). Sustained vowels and continuous speech in the auditory-perceptual evaluation of dysphonia severity. Jornal da Sociedade Brasileira de Fonoaudiologia, 24(2), 107–12.
- Maryn, Y. et al. (2010). Toward improved ecological validity in the acoustic measurement of overall voice quality: combining continuous speech and sustained vowels. Journal of Voice, 24(5), 540–55.
- Takahashi, H. & Koike, Y. (1976). Some perceptual dimensions and acoustical correlates of pathologic voices. Acta Oto-Laryngologica. Supplementum, 338, 1–24.
- Baken, R. J. & Orlikoff, R. F. (2000). Clinical Measurement of Speech and Voice. San Diego: Singular Publishing Group.
- Schaeffler, F., Jannetts, S. & Beck, J. (2015). Phonation Stabilisation Time as an Indicator of Voice Disorder. ICPhS [accepted].