Voicing and voice assimilation in Russian stops

University of Iowa

Iowa Research Online

Theses and Dissertations

Summer 2012

Voicing and voice assimilation in Russian stops

Vladimir Kulikov

University of Iowa

Copyright 2012 Vladimir Kulikov

This dissertation is available at Iowa Research Online: https://ir.uiowa.edu/etd/3327

Follow this and additional works at: https://ir.uiowa.edu/etd

Part of the Linguistics Commons

Recommended Citation

Kulikov, Vladimir. "Voicing and voice assimilation in Russian stops." PhD (Doctor of Philosophy) thesis, University of Iowa, 2012. https://doi.org/10.17077/etd.r6ib0d07.

VOICING AND VOICE ASSIMILATION IN RUSSIAN STOPS

by

Vladimir Kulikov

An Abstract

Of a thesis submitted in partial fulfillment of the requirements for the Doctor of Philosophy degree in Linguistics in the Graduate College of The University of Iowa

July 2012

Thesis Supervisor: Professor Catherine O. Ringen

ABSTRACT

The main objective of this thesis is to investigate acoustic cues for the voicing contrast in stops in Russian for effects of speaking rate and phonetic environment. Although the laryngeal contrast in Russian is assumed to be a [voice] contrast, very few experimental studies have looked at the acoustic properties of Russian voiced and voiceless stops. Most claims about acoustic properties of stops and phonological processes that affect them (voice assimilation and final devoicing) have been made based on impressionistic transcriptions. The present study provides evidence that (1) voicing in voiced stops is affected by speaking rate manipulation, (2) stops in Russian retain underlying voicing specification in presonorant position and voice assimilation occurs only in obstruent clusters, and (3) phonological processes of voice assimilation and final devoicing do not result in complete neutralization.

The target of the investigation is voiced and voiceless intervocalic stops, stops in clusters, and final stops in different prosodic positions within a word and at the phrase level. The acoustic cues to voicing (duration of voicing, stop closure duration, vowel duration, f0, and F1) were measured from the production data of 14 monolingual speakers of Russian recorded in Russia. Speakers produced words and phrases with target stops in three speaking rate conditions: list reading, slow rate and fast rate. The data were analyzed in 5 blocks focusing on (1) word-internal stops, (2) voice assimilation in stops in prepositions, (3) cases of so-called “sonorant transparency”, (4) voice assimilation in stops before /v/, and (5) voicing processes across a word boundary.

The results of the study present a challenge to the widely-held assumption that phonological processes precede phonetic processes at the phonology-phonetics interface. It is shown that the underlying contrast leaves traces on assimilated and devoiced stops. To account for the findings, a phonology-phonetics interface that allows interaction between the modules is required. In addition, the results show that temporal cues are affected by speaking rate manipulation, but the effect of rate on voicing is found only in voiced stops. Duration of voicing and VOT in voiceless stops are not affected by speaking rate. The results also show that no effect of C2 is obtained on voicing in C1 stops in obstruent-sonorant-obstruent clusters, thus no “phonological sonorant transparency to voice assimilation” is found in Russian. Rather, the study provides evidence that there is variation in production of voicing in stops in prepositions, and that voice assimilation in stops before /v/ followed by a voiced obstruent is optional for some speakers.

VOICING AND VOICE ASSIMILATION IN RUSSIAN STOPS

by

Vladimir Kulikov

A thesis submitted in partial fulfillment of the requirements for the Doctor of Philosophy degree in Linguistics in the Graduate College of The University of Iowa

July 2012

Thesis Supervisor: Professor Catherine O. Ringen

Graduate College

The University of Iowa

Iowa City, Iowa

CERTIFICATE OF APPROVAL

___________________________

PH.D. THESIS

____________

This is to certify that the Ph.D. thesis of

Vladimir Kulikov

has been approved by the Examining Committee for the thesis requirement for the Doctor of Philosophy degree in Linguistics at the July 2012 graduation.

Thesis Committee: Catherine O. Ringen, Thesis Supervisor

Jill N. Beckman

William D. Davies

Bob McMurray

Jerzy Rubach

To Olga

ACKNOWLEDGEMENTS

I wish to express my deepest gratitude to all who were directly involved in my dissertation. First of all, I am indebted to my dissertation advisor, Professor Catherine Ringen, whose insightful comments, constant guidance and support helped me to separate the wheat from the chaff and made it all possible. I was always impressed by her generous support for me throughout the thesis-writing process, and with other academic matters.

Second, I would like to thank the members of my doctoral committee, Professors Jill Beckman, William Davies, Bob McMurray, and Jerzy Rubach, for their invaluable comments and inspiring insights. In addition to their help with the thesis, I also had the honor and privilege of participating in their classes and seminars. They taught me the thrill and reality of linguistics and psycholinguistics, and they greatly contributed to the development of my thinking, reasoning, and teaching in this area. I also thank Professors Roumyana Slabakova, Elena Gavruseva, Richard Hurtig, and Alice Davison, who contributed to my linguistic education, and Professors Jim Maxey and Walter Vispoel for their help with understanding statistics.

I am grateful to the Graduate College of the University of Iowa for their financial support. This research would not have been possible without the T. Anne Cleary International Dissertation Research Fellowship (2010) and the Ballard and Seashore Dissertation Year Fellowship (2011-2012).

I also wish to thank my friends (and former colleagues) from the Institute of Philology at Tambov State University, Russia, for their help with recruiting subjects and other assistance: Professors Oleg Polyakov, Vera Romanova, Anna Daen, and Olga Davydenkova. And, of course, very special thanks go to the 14 students of Tambov State University, who showed genuine enthusiasm in performing production tasks for my experiment.

I would also like to express my thanks to my fellow graduate students in the Linguistics Department: Lalita Dhareshwar, Craig Dresser, Lauren Eby, Ivan Ivanov, Sangkyun Kang (Danny), Molly Kelly, Eri Kurniawan, Kumyoung Lee, Jeffery Press, Lindsey Quinn-Wriedt, Tomomasa Sasa, Yulia Skaleva, and Marta Tryzna, with whom I shared both good and hard times in the past six years.

Finally, I want to thank my wife, Olga, and my daughter, Maria, the true motivation behind this dissertation, for their patience and love.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER I INTRODUCTION

CHAPTER II THE VOICING CONTRAST AND ACOUSTIC CUES FOR VOICING

2.1. Voiced and voiceless stops in Russian
  2.1.1. Word-initial and word-medial stops
  2.1.2. Word-final devoicing
  2.1.3. Voice assimilation
  2.1.4. Sonorant transparency
  2.1.5. Voicing contrast before /v/
2.2. Acoustic cues to voicing
  2.2.1. Voice onset time (VOT)
  2.2.2. F1 transition
  2.2.3. Fundamental frequency (f0)
  2.2.4. Duration of a preceding vowel
  2.2.5. Voicing during closure
2.3. Voicing contrasts and phonological features
2.4. Voicing contrasts and speaking rate
2.5. Effects of speaking rate and environment on voicing in Russian: Rationale for the study

CHAPTER III EXPERIMENT 1: VOICING IN WORD-INTERNAL STOPS

3.1. Motivation for the study
3.2. Method
  3.2.1. Participants
  3.2.2. Stimuli
    3.2.2.1. Word-initial stops
    3.2.2.2. Word-medial intervocalic stops
    3.2.2.3. Word-medial stops in obstruent clusters
    3.2.2.4. Word-final stops
  3.2.3. Procedure and measurements
3.3. Results I: Word-initial stops
  3.3.1. VOT in utterance-initial stops
  3.3.2. Initial stops in connected speech
    3.3.2.1. Rate and word duration
    3.3.2.2. Acoustic cues for voicing in connected speech
    3.3.2.3. Distributions of VOT and voicing during closure in slow and fast speech
  3.3.3. Word-initial stops in connected speech: Examining variability in cues
3.4. Results II: Word-medial stops
  3.4.1. Effect of underlying voicing
  3.4.2. Effect of speaking rate
    3.4.2.1. Closure duration
    3.4.2.2. Voicing during closure
    3.4.2.3. VOT
    3.4.2.4. Preceding vowel
    3.4.2.5. Phonation of surrounding vowels
  3.4.3. Distributions of voicing during closure and VOT
  3.4.4. Effect of sonorant type
  3.4.5. Word-medial stops: Examining variability in cues
3.5. Results III: Clusters and voice assimilation
  3.5.1. Voicing in C2
  3.5.2. C1 stops
    3.5.2.1. Effects of underlying and C2 voicing
    3.5.2.2. Effect of speaking rate
    3.5.2.3. Distribution of voicing in C1 stops
  3.5.3. Stops in clusters: Examining variability in cues
3.6. Results IV: Devoicing in final stops
  3.6.1. Effect of underlying voice
  3.6.2. Effect of speaking rate
  3.6.3. Distribution of voicing
  3.6.4. Final stops: Examining variability in cues
3.7. Effects of speaking rate and environment on voicing duration: Omnibus analysis
3.8. Discussion and conclusions

CHAPTER IV EXPERIMENT 2: VOICE ASSIMILATION IN PREPOSITIONS

4.1. Background
4.2. Participants and stimuli
4.3. Procedure and measurements
4.4. Results
  4.4.1. Closure duration
  4.4.2. Duration of voicing
  4.4.3. Voicing Ratio
  4.4.4. Duration of a preceding vowel
  4.4.5. F0 and F1
4.5. Discussion and conclusion

CHAPTER V EXPERIMENT 3: VOICE ASSIMILATION IN OBSTRUENT-SONORANT-OBSTRUENT CLUSTERS: ARGUMENTS AGAINST ‘SONORANT TRANSPARENCY’

5.1. Background
5.2. Participants and stimuli
5.3. Procedure and measurements
5.4. Results
  5.4.1. Effect of speaking rate
  5.4.2. Voicing in C2 obstruents
  5.4.3. Closure duration of C1 stops
  5.4.4. Voicing duration
  5.4.5. Voicing Ratio
  5.4.6. Vowel duration
  5.4.7. F0 and F1
  5.4.8. Tokens with ‘transparency’ effect
5.5. Discussion and conclusion

CHAPTER VI EXPERIMENT 4: VOICE ASSIMILATION BEFORE /V/

6.1. Background
6.2. Participants and stimuli
6.3. Procedure and measurements
6.4. Results
  6.4.1. Rate and word duration
  6.4.2. Closure duration
  6.4.2. Duration of voicing
  6.4.3. Voicing ratio
  6.4.4. Duration of a preceding vowel
  6.4.5. Assimilation: Individual results
6.6. Discussion and conclusion

CHAPTER VII EXPERIMENT 5: VOICE ASSIMILATION AND FINAL DEVOICING AT PHRASE LEVEL

7.1. Background
7.2. Participants
7.3. Stimuli
7.4. Measurements
7.5. Results
  7.5.1. Segment length and speaking rate
  7.5.2. C2 stops
  7.5.3. Effects of speaking rate and following segment on C1 voicing
  7.5.4. Voice assimilation before C2 stops
    7.5.4.1. Closure duration
    7.5.4.2. Voicing duration
    7.5.4.3. Duration of a preceding vowel
    7.5.4.4. Interim conclusion
  7.5.5. Final devoicing
    7.5.5.1. Final devoicing in stops before a vowel
    7.5.5.2. Final devoicing in the list condition
    7.5.5.3. Interim conclusion
7.6. Discussion and conclusion

CHAPTER VIII SUMMARY OF RESULTS AND IMPLICATIONS

8.1. Effect of speaking rate on voicing
8.2. Incomplete neutralization in cases of voice assimilation and final devoicing
  8.2.1. Results of the current study
  8.2.2. Results of the current study and previous studies
  8.2.3. Implications for phonology
8.3. Voicing in prepositions
8.4. Conclusions

REFERENCES

APPENDIX A LIST OF STIMULI

APPENDIX B RESULTS OF ACOUSTIC MEASUREMENTS IN EXPERIMENT 1

APPENDIX C RESULTS OF ACOUSTIC MEASUREMENTS IN EXPERIMENT 2

APPENDIX D RESULTS OF ACOUSTIC MEASUREMENTS IN EXPERIMENT 3

APPENDIX E RESULTS OF ACOUSTIC MEASUREMENTS IN EXPERIMENT 4

LIST OF TABLES

Table 1. Summary of ANOVAs examining effects of underlying voicing (2 levels), speaking rate (2 levels), sonorant type (2 levels), and place of articulation (3 levels) on acoustic cues in word-initial stops.

Table 2. Summary of regression analyses examining effects of speaker (14 levels), rate (2 levels), sonorant type (2 levels), place of articulation (3 levels), and underlying voice (2 levels) in word-initial stops.

Table 3. Summary of important acoustic cues as predictors of voicing in word-initial stops, pooled across all rates and contexts.

Table 4. Summary of ANOVAs examining effects of underlying voicing (2 levels), rate (3 levels), sonorant type (2 levels), and place of articulation (3 levels) for word-medial stops.

Table 5. Summary of regression analyses examining effects of speaker (14 levels), rate (3 levels), sonorant type (2 levels), place (3 levels), and underlying voice (2 levels) in word-medial stops.

Table 6. Summary of important acoustic cues as predictors of voicing in word-medial intervocalic stops, pooled across all rates and contexts.

Table 7. Means and standard deviations (in brackets) for acoustic properties of C2 stops.

Table 8. Summary of ANOVAs examining effects of underlying voicing (2 levels), C2 voicing (2 levels), and speaking rate (2 levels) on acoustic cues in C1 stops in stop clusters.

Table 9. Summary of regression analyses examining effects of speaker (14 levels), rate (2 levels), C2 voice (2 levels), underlying voice (2 levels), and surface voice (2 levels) in stops in clusters.

Table 10. Summary of important acoustic cues as predictors of underlying voicing in C1 stops in stop clusters, pooled across all rates and contexts.

Table 11. Summary of important acoustic cues as predictors of surface voicing in C1 stops in stop clusters, pooled across all rates and contexts.

Table 12. Summary of ANOVAs examining effects of underlying voicing (2 levels), speaking rate (2 levels), and place of articulation (3 levels) on acoustic cues in final stops.

Table 13. Summary of regression analyses examining effects of speaker (14 levels), rate (2 levels), place (3 levels), and underlying voice (2 levels) on voicing cues in word-final stops.

Table 14. Summary of important acoustic cues as predictors of underlying voicing in word-final stops, pooled across all rates and contexts.

Table 15. Percentage of C1 tokens with variation in voicing duration in slow and fast speech for 13 speakers.

Table 16. Percent of modified C1 stops and sonorants in obstruent-sonorant-obstruent clusters, pooled across 14 speakers and in slow and fast rate conditions.

Table 17. Individual results of voice assimilation in clusters with /v/ in the list and fast rate conditions.

Table B1. Mean VOT and standard deviations (in brackets) for voiced and voiceless utterance initial stops (list reading).

Table B2. Means (ms) and standard deviations (in brackets) for VOT of word-initial (sentence-medial) voiceless stops at three places of articulation before a vowel and a consonant in slow and fast conditions.

Table B3. Means (ms) and standard deviations (in brackets) for voicing of word-initial (sentence-medial) voiced stops at three places of articulation before a vowel and a consonant.

Table B4. Means (Hz) and standard deviations (in brackets) for f0 and F1 after voiced and voiceless stops at three places of articulation, pooled across rates and sonorant types.

Table B5. Means (ms) and standard deviations (in brackets) for voicing during closure of word-medial voiceless stops at three places of articulation before a vowel and a consonant.

Table B6. Means (ms) and standard deviations (in brackets) for closure duration of word-medial voiceless stops at three places of articulation before a vowel and a consonant.

Table B7. Means (ms) and standard deviations (in brackets) for closure duration of word-medial voiced stops at three places of articulation before a vowel and a consonant.

Table B8. Means (ms) and standard deviations (in brackets) for voicing during closure of word-medial voiced stops at three places of articulation before a vowel and a consonant.

Table B9. Mean VR and percent of fully voiced word-medial voiced stops at three places of articulation before a vowel and a consonant.

Table B10. Mean VR of word-medial voiceless stops at three places of articulation before a vowel and a consonant.

Table B11. Means (Hz) and standard deviations (in brackets) for f0 and F1 before (‘pre’) and after (‘post’) word-medial voiced and voiceless stops at three places of articulation.

Table B12. Means (ms) and standard deviations (in brackets) for closure duration of the first (C1) stop in obstruent clusters.

Table B13. Means (ms) and standard deviations (in brackets) for voicing during closure of the underlying voiced and voiceless C1 stops in obstruent clusters. .......................................................................................................183

Table B14. Mean VR for closure duration and percent of fully voiced C1 stops in obstruent clusters. .......................................................................................184

Table B15. Means (Hz) and standard deviations (in brackets) for f0 and F1 on a vowel preceding a C1 stop in obstruent clusters. ........................................184

Table B16. Means (ms) and standard deviations (in brackets) for closure duration of underlying voiced and voiceless final stops at three places of articulation. .................................................................................................184

Table B17. Means (ms) and standard deviations (in brackets) for voicing during closure of underlying voiced and voiceless final stops at three places of articulation. .............................................................................................185

Table B18. Mean VR of final underlying voiced and voiceless stops and percent of fully voiced final stops at three places of articulation. ...........................185

Table B19. Means (Hz) and standard deviations (in brackets) for f0 and F1 before underlying voiced and voiceless stops at three places of articulation. .......185

Table C1. Means (ms) and standard deviations (in brackets) of closure duration of underlying /d/ and /t/ in clusters in a word and a preposition.................186

Table C2. Means (ms) and standard deviations (in brackets) of voicing duration of underlying /d/ and /t/ in clusters within a word and a preposition. ........186

Table C3. Mean VRs in underlying /t/ and /d/ within a word and a preposition. ........187

Table C4. Percent of fully voiced underlying /d/ and /t/ in clusters within a word and a preposition. ........................................................................................187

Table C5. Means (ms) and standard deviations (in brackets) of duration of a vowel preceding underlying /d/ and /t/ in clusters within a word and a preposition...................................................................................................187

Table C6. Means (Hz) and standard deviations (in brackets) for f0 and F1 on a vowel preceding underlying /d/ and /t/ in prepositions in three types of clusters. .......................................................................................................188

Table D1. Means (ms) and standard deviations (in brackets) of closure duration of underlying voiced and voiceless C1 stops in clusters with and without an intervening sonorant. .................................................................189

Table D2. Means (ms) and standard deviations (in brackets) of duration of voicing during closure of underlying /d/ and /t/ C1 stops in clusters with and without an intervening sonorant. ..................................................189

Table D3. Voicing ratios of underlying /d/ and /t/ C1 stops in clusters with and without an intervening sonorant. .................................................................189

Table D4. Percent of fully voiced underlying voiced and voiceless C1 stops in clusters with and without an intervening sonorant. ....................................190

Table D5. Means (ms) and standard deviations (in brackets) of duration of a vowel preceding underlying voiced and voiceless C1 stops in clusters with and without an intervening sonorant. ..................................................190

Table D6. Means (Hz) and standard deviations (in brackets) for f0 and F1 on a vowel preceding a C1 stop for underlying voiced and voiceless stops in obstruent-sonorant-obstruent clusters. ....................................................190

Table E1. Mean durations of stop closure (ms) and standard deviations (in brackets) for underlying /d/ and /t/ before /v/ followed by a vowel, a voiced stop, and a voiceless stop in the list, slow, and fast rate conditions. ...................................................................................................191

Table E2. Mean durations of voicing during closure (ms) and standard deviations (in brackets) for underlying /d/ and /t/ before /v/ followed by a vowel, a voiced stop, and a voiceless stop in the list, slow, and fast rate conditions. .....................................................................................191

Table E3. Mean voicing ratios (VR) for underlying /d/ and /t/ before /v/ followed by a vowel, a voiced stop, and a voiceless stop in the list, slow, and fast rate conditions. .....................................................................................191

Table E4. Percent of fully voiced stops before /v/ followed by a vowel, a voiced stop, and a voiceless stop in the list, slow, and fast rate conditions. .........192

Table E5. Mean durations (ms) and standard deviations (in brackets) of vowels preceding underlying /d/ and /t/ before /v/ followed by a vowel, a voiced stop, and a voiceless stop in the list, slow, and fast rate conditions. ...................................................................................................192

LIST OF FIGURES

Figure 1. Examples of important acoustic measurements of voiced and voiceless stops (tokens (a) darom ‘for free’ and (b) motor ‘engine’, Speaker 2 (male), slow rate). .........................................................................................32

Figure 2. Effect of following sonorant (vowel/consonant) on VOT of (a) initial voiced and (b) voiceless stops, broken down by place of articulation. Henceforth, a * indicates a significant difference between adjacent columns. ........................................................................................................34

Figure 3. VOT distributions of Russian voiced and voiceless stops in the list condition. ......................................................................................................35

Figure 4. Effects of speaking rate and voicing on word duration. ...............................36

Figure 5. Changes in (a) voicing for word-initial (intervocalic) voiced (round markers) and in (b) VOTs for voiceless (square markers) stops in slow (light markers) and fast (dark markers) speaking rate conditions. ................40

Figure 6. Effects of speaking rate on (a) closure duration and (b) voicing duration of voiceless and voiced word-medial stops. ...................................47

Figure 7. Effect of speaking rate on (a) VOT of voiceless word-medial stops and (b) duration of a preceding vowel. .........................................................48

Figure 8. Distributions of durations of voicing for voiceless (square markers) and voiced (round markers) stops, and VOTs for voiceless stops (triangle markers) in slow (light markers) and fast (dark markers) speaking rate conditions. ...............................................................................49

Figure 9. Effects of sonorant type on (a) closure duration and (b) duration of a preceding vowel for voiceless and voiced word-medial stops. .....................51

Figure 10. (a) Sonorant voicing interaction and (b) sonorant rate interaction for duration of voicing in word-medial voiced stops. ...................................52

Figure 11. Effects of C2 voicing and speaking rate on (a) closure duration and (b) voicing duration in underlying voiced and voiceless C1 stops in a cluster. ...........................................................................................................58

Figure 12. Effects of underlying voicing and C2 voicing on (a) duration of a preceding vowel and (b) F1 frequency on underlying voiced and voiceless C1 stops in a cluster. .....................................................................59

Figure 13. Distributions of voicing during closure of C1 underlying voiceless and voiced stops before voiceless (left column) and voiced (right column) C2 stops. ........................................................................................................61

Figure 14. Distributions of durations of voicing during closure for voiceless (square markers) and voiced (round markers) C1 stops in a cluster in slow (light markers) and fast (dark markers) speaking rate conditions. .......61

Figure 15. Effects of place of articulation and underlying voicing on (a) closure duration and (b) voicing into closure in final stops. .....................................67

Figure 16. Effect of speaking rate on (a) closure duration and (b) duration of a preceding vowel of word-final underlying voiced and voiceless stops. .......68

Figure 17. Voicing distributions of Russian underlying voiceless and voiced stops in slow (left column) and fast (right column) speaking rate conditions, broken down by place of articulation. ........................................70

Figure 18. Effects of (a) speaking rate and (b) environment on duration of voicing in voiced (dark bars) and voiceless (light bars) word-internal coronal stops [t, d], pooled across all speakers. ............................................73

Figure 19. Effects of (a) speaking rate and (b) underlying voicing on closure duration of underlying /d/ and /t/ in consonantal clusters. ............................83

Figure 20. Distributions of voicing during closure in underlying voiceless and voiced stops in prepositions, broken down by speaking rate (slow, fast) and following segment (sonorant, voiceless C2, voiced C2). ...............85

Figure 21. Effects of (a) following segment and (b) speaking rate on duration of voicing during closure of underlying voiced and voiceless stops in consonantal clusters. .....................................................................................86

Figure 22. The effects of (a) underlying voice and (b) following segment on duration of a preceding vowel in prepositions and word-internally. ............89

Figure 23. Examples of C1 and C2 tokens in phrases (a) nad rtutju ‘over mercury’, spoken by S10 (m), fast rate (C1 stop closure, fully voiced, C2 closure, voiceless) and (b) nad parom ‘over steam’, spoken by S1 (f), fast rate (C1 stop closure, voiceless, C2 closure, voiceless). ..................99

Figure 24. Mean phrase duration in slow and fast speaking rate conditions................101

Figure 25. Mean VR of voiced and voiceless C2 obstruents in the slow and fast rates. ............................................................................................................102

Figure 26. Distributions of voicing during closure in presonorant /t/ and /d/ stops in obstruent-sonorant-obstruent clusters, broken down by speaking rate (slow, fast) and C2 type (voiced C2, voiceless C2). ............................104

Figure 27. Effects of cluster type (a) and speaking rate (b) on duration of voicing in C1 stops, pooled across 13 speakers. ......................................................105

Figure 28. Distributions of VR in presonorant C1 stops in clusters, broken down by C2 obstruent (voiced – upper row, voiceless – lower row) and rate (slow – left column, fast – right column), pooled across 13 speakers. .......107

Figure 29. Effects of cluster type on duration of a preceding vowel in C1 stops. .......108

Figure 30. Examples of tokens with ‘transparency effect’: (a) ot lgunji ‘from a liar’, spoken by S11 (m), fast rate (voicing of /t/ before a voiced C2) and (b) nad rtutju ‘over mercury’, spoken by S9 (m), fast rate (devoicing of /d/ before a voiceless C2). ....................................................109

Figure 31. Effect of speaking rate on word duration. ...................................................118

Figure 32. Effects of following segment (a) and speaking rate (b) on closure duration in stops before /v/. ........................................................................119

Figure 33. Effects of following segment type (a) and speaking rate (b) on duration of voicing in stops before /v/. .......................................................121

Figure 34. Distributions of voicing duration of underlying /d/ and /t/ before /v/ followed by a vowel, a voiced obstruent, and a voiceless obstruent in the list (left column) and fast (right column) conditions. ...........................122

Figure 35. Effects of following segment (a) and speaking rate (b) on duration of a vowel preceding C1 stops before /v/. ..........................................................125

Figure 36. C1 stop closure, voiceless, before a vowel (from a token luk očistili ‘onion was peeled’, spoken by S6 (m), fast rate) ........................................132

Figure 37. C1 stop closure, voiceless; C2 stop, voiceless (from a token kod podobran, ‘the code is found’, spoken by S6 (m), fast rate) .......................133

Figure 38. C1 stop closure, voiced, C2 stop, voiced (from a token lug dokošen ‘the lawn is mown’, spoken by S1 (f), fast rate) .........................................133

Figure 39. Mean segment duration in three speaking rate conditions ..........................135

Figure 40. Mean VR of C2 stops in three speaking rate conditions. ............................136

Figure 41. The effect of following segment on duration of voicing in word-final C1 stops pooled across all speakers and speaking rates ..............................137

Figure 42. The effects of C2 voice and speaking rate on voicing in word-final C1 stops pooled across all speakers. .................................................................138

Figure 43. Distribution of voicing durations in word-final C1 stops in (a) slow and (b) fast speech, pooled across eight speakers. ......................................139

Figure 44. Effects of C2 voicing and speaking rate on closure duration (a) and voicing (b) in word-final C1 stops. .............................................................140

Figure 45. Differences in duration of voicing in devoiced final stops before a word beginning with a vowel. .....................................................................143

Figure 46. Effect of underlying voicing on the duration of a preceding vowel before final stops before a word beginning with a vowel. ..........................144

Figure 47. Effects of following segment and underlying voice on voicing duration in devoiced final stops in the list condition. .................................145

Figure 48. Effects of following segment and underlying voice on duration of a preceding vowel before devoiced final stops in the list condition. .............145

CHAPTER I

INTRODUCTION

This dissertation investigates voicing properties of Russian stops and the

processes that involve voiceless and voiced stops: final devoicing and voice assimilation.

The specific purpose of the study is to establish what the most important acoustic

correlates of the voicing contrast in Russian are and to discuss phonological implications

of the results.

The statement that claims about phonological processes must be based on reliable

data seems obvious; yet, too often in the past, phonologists have relied on descriptive

grammars and impressionistic transcriptions of pronunciations instead of more accurate

instrumental analysis. For Russian, only a few studies have addressed some of the basic

questions about the acoustic properties of voicing (Halle 1959; Jones and Ward 1969;

Bolla 1981; Barry 1988, 1995; Chen 1970; Burton and Robblee 1997; Ringen and

Kulikov, in press).

Recent studies of the voicing properties in Germanic languages (Jessen and

Ringen 2002; Helgason and Ringen 2008; Beckman, Jessen, and Ringen 2009) have

shown that instrumental measurements often reveal that the output of the phonology is

not what is expected, given accepted phonological analyses. Observations of the

pronunciation of Russian speakers and my preliminary study (Kulikov 2010) suggest that

not all claims about the voicing properties of obstruents and voice assimilation in Russian

(e.g. Hayes 1984) are accurate. Yet, these data have been used to support important

theoretical claims about the nature of voicing in human language (Kiparsky 1985;

Lombardi 1991, 1999; Steriade 1999; Rubach 2008).

In this thesis, I address the following questions: Is there complete devoicing of

word-final stops in Russian? Is voice assimilation complete in word-internal obstruent-

obstruent clusters and in clusters across a word boundary? What are the phonetic

properties of consonants in obstruent-sonorant-obstruent clusters and what causes the

“sonorant transparency” to voice assimilation?

The structure of the dissertation is as follows. In Chapter 2, I present the data

illustrating voiced and voiceless stops in Russian, acoustic cues to voicing, and the

methodology of the study. Chapters 3, 4, 5, 6 and 7 present results of the five experiments

which investigate the voicing properties of Russian stops. The first experiment examines

the voicing properties of Russian obstruents and regular cases of voice assimilation

word-internally in slow and fast speech. The second experiment examines voice assimilation

across clitic and word boundaries. The third experiment examines voice assimilation

through a sonorant in slow and fast speech. The fourth experiment examines voice

assimilation before clusters with /v/. The fifth experiment focuses on voice assimilation

and final devoicing across a word boundary at the phrase level. In Chapter 8, I summarize

the results and discuss the implications for the phonological analysis of voicing and voice

assimilation.

CHAPTER II

THE VOICING CONTRAST AND ACOUSTIC CUES FOR VOICING

2.1. Voiced and voiceless stops in Russian

Russian is a language that has usually been described as having a laryngeal

contrast between voiceless unaspirated and fully voiced stops. The voicing contrast in

Russian is similar to that of other languages (e.g. Spanish, French) that have voiceless

unaspirated and fully (pre)voiced utterance initial voiced stops (Lisker and Abramson

1964). Such languages will be referred to as ‘true voice’ languages. In the following

sections, I present what are widely claimed (e.g. Avanesov 1968 among others) to be the

facts about voiced and voiceless stops in Russian. However, since stops and fricatives in

Russian usually behave in the same fashion in voicing processes, the more general term

‘obstruent’ is sometimes used in the text whenever the distinction between stops and

fricatives is not relevant.

2.1.1. Word-initial and word-medial stops

Russian has a contrast at three places of articulation between voiceless

unaspirated stops /p/, /t/, /k/ and fully voiced stops /b/, /d/, /g/. Vowels and sonorants are

usually described as not triggering voice assimilation; therefore, the contrast in initial

stops is preserved in a prevocalic/presonorant position, as shown in (1).

(1) /p/alka [p] ‘stick’ /b/alka1 [b] ‘beam’

/p/ravyj [p] ‘right’ /b/ravyj [b] ‘brave’

/t/elo [t] ‘body’ /d/elo [d] ‘business’

1 Speech sounds are represented by IPA symbols given in square brackets. To represent Cyrillic

letters that are not present in the Roman alphabet, the following equivalents are used: c = [ʦ], š = [ʃ], ž =

[ʒ], č = [ʧ], x = [x], j = [j], y = [ɨ]. An apostrophe after a letter represents phonological palatalization, e.g.

l’= [lj]. Slashes // are used for underlying, phonemic representation.

/t/rava [t] ‘grass’ /d/rova [d] ‘firewood’

/k/onec [k] ‘end’ /g/onec [g] ‘courier’

/k/rov’ [k] ‘blood’ /g/rom [g] ‘thunder’

The contrast is also maintained in intervocalic/intersonorant position word-

internally (2).

(2) za/p/il [p] ‘washed down’ masc. za/b/il [b] ‘scored’ masc.

vo/p/ros [p] ‘question’ o/b/lako [b] ‘cloud’

le/t/ok [t] ‘bee-entrance’ le/d/ok [d] ‘(thin) ice’

me/t/la [t] ‘broom’ ve/d/ro [d] ‘pail’

lu/k/a [k] ‘onion’ Gen.sg. lu/g/a [g] ‘lawn’ Gen.sg.

o/k/ras [k] ‘color’ mo/g/li [g] ‘could’ Past.pl.

Word-internal morpheme boundaries are usually invisible for voicing processes.

For example, final obstruents in prefixes preserve underlying voicing when occurring in a

prevocalic/presonorant position, as shown in (3).

(3) po/d+/opytnyj [d] ‘experimental’

o/t+/lamyvat’ [t] ‘to break off’

2.1.2. Word-final devoicing

Voiced obstruents in Russian are usually described as undergoing devoicing when

they occur word-finally, as shown in (4).

(4) du/b/ [p] ‘oak’ Nom.sg. c.f. du/b/a [b] Gen.sg.

sa/d/ [t] ‘orchard’ Nom.sg. sa/d/a [d] Gen.sg.

lu/g/ [k] ‘meadow’ Nom.sg. lu/g/a [g] Gen.sg.

Word-final devoicing does not occur in prepositions when they are used in

prepositional phrases, as illustrated in (5). Prepositions in Russian behave as proclitics,

which do not constitute separate Prosodic Words (see Selkirk 1995 for a detailed analysis

of prosodic constituents). In phonological processes that involve laryngeal features, the

boundary between a preposition and a content word is treated as a word-internal

morpheme boundary between a prefix and a root. Final obstruents in prepositions

preserve their underlying voicing before words beginning with vowels or sonorants, as

illustrated in (5), similar to the prevocalic obstruents in prefixes in (3).

(5) po/d/ # uglom [d] ‘at an angle’

o/t/ # okna [t] ‘from the window’

Final devoicing is, however, found in prepositions that are used as separate words,

as shown in (6).

(6) voda i po/d/ [t] i na/d/ [t] ‘water both under and over’

Some clitic boundaries do trigger final devoicing. Final obstruents that precede

enclitics (e.g. interrogative particle li ‘if/whether’) regularly devoice, as shown in (7).

(7) du/b/ li [p] ‘whether the oak’

sa/d/ li [t] ‘whether the orchard’

While proclitics are ‘affixal’ clitics, which are prosodified with the following noun under

a Prosodic Word, enclitics are ‘free’ clitics, which are attached directly to a Phonological

Phrase. 2

2.1.3. Voice assimilation

Russian is usually described as having regressive voice assimilation in obstruent

clusters. Two or more obstruents in a cluster are described as having the same

specification for voice, which is determined by the laryngeal specification of the

2 According to Selkirk (1995:450), the major distinction between ‘free’ clitics and ‘affixal’ clitics

is in the domain of stress. ‘Free’ clitics are never stressed, while ‘affixal’ clitics can be stressed within a

word. Another option – ‘internal’ clitics, which are prosodified within the same Prosodic Word, – is ruled

out in Russian because the clitic-word boundary in this case is word-internal. However, it has been shown

(e.g. Rubach 2000) that some phonological processes across the proclitic-word boundary (e.g.

palatalization) produce the same results as across a boundary between two content words.

rightmost obstruent in a cluster.3 Thus, both voicing of an underlying voiceless stop and

devoicing of an underlying voiced stop are attested. Word-internally, voice assimilation

occurs within a morpheme (8a) and across morpheme boundaries (8b).

(8) a. /vs/e [fs] ‘all’ (pl.) c.f. /v/es’ [v] ‘all’ (sg.)

b. sva/t+’b/a [dʲb] ‘wedding’ sva/t/+at’ [t] ‘to ask in marriage’

le/d+k/a [tk] ‘ice’ Gen.sing.dim. le/d/+ok [d] Nom.sing.dim.

Voice assimilation is claimed to occur across a clitic boundary. Final obstruents in

prepositions (proclitics) agree in voicing with the root-initial obstruent, as illustrated in

(9). Initial obstruents in enclitics (e.g. the irrealis particle by or the emphatic focus

particle to) trigger voice assimilation of the final obstruent in a preceding content word,

as shown in (10).

(9) o/t g/oroda [dg] ‘from the city’ c.f. o/t/ ugla [t] ‘from the corner’

/k z/emle [gz] ‘to the ground’ /k/ otcu [k] ‘to the father’

(10) ko/t/ # /b/y [db] ‘the cat would…’ c.f. ko/t/u [t] ‘cat’ Dat.

du/b/ # /t/o [pt] ‘oak in fact…’ du/b/y [b] ‘oak’ pl.

Voice assimilation can also occur across a word boundary. Recall that obstruents

undergo devoicing word-finally. As a result, devoicing occurs in utterance-final

obstruents, which was shown in (4), as well as in words in a phrase when the following

word begins with a sonorant segment, as illustrated in (11).

(11) du/b/ # upal [p] ‘the oak tree fell’

However, if a word-final stop occurs before a word that begins with an obstruent,

this stop usually assimilates, as illustrated in (12). Voice assimilation overrides final

devoicing so that all obstruents in a sequence agree in voicing.

3 In this study, the leftmost obstruent (stop) in an obstruent cluster is schematically represented as

C1, and the rightmost obstruent is represented as C2.

(12) ko/t/ ## /b/yl [db] ‘the cat was’

lu/g/ ## /p/okošen [kp] ‘the meadow was mowed’
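
The traditional description summarized so far (word-final devoicing, overridden by regressive assimilation to the rightmost obstruent of a cluster) can be sketched as a small procedure. This is only an illustration of the textbook claims, not of the acoustic results reported in this study. The function names, the one-character-per-segment transliteration, and the convention that a proclitic is written together with its host as one prosodic word are assumptions made for the sketch; /v/ and sonorant transparency are deliberately left out.

```python
# Voiced -> voiceless obstruent pairs (illustrative subset; /v/ is omitted on purpose).
DEVOICE = {"b": "p", "d": "t", "g": "k", "z": "s", "ž": "š"}
VOICE = {voiceless: voiced for voiced, voiceless in DEVOICE.items()}


def is_obstruent(seg):
    return seg in DEVOICE or seg in VOICE


def devoice_final(word):
    """Devoice the obstruent(s) at the right edge of a prosodic word."""
    segs = list(word)
    i = len(segs) - 1
    while i >= 0 and is_obstruent(segs[i]):
        segs[i] = DEVOICE.get(segs[i], segs[i])
        i -= 1
    return segs


def surface_form(prosodic_words):
    """Apply final devoicing per word, then regressive assimilation across the phrase.

    The words are concatenated without spaces, so a cluster spanning a word
    boundary is treated like any other obstruent cluster (no pause assumed).
    """
    segs = []
    for word in prosodic_words:
        segs.extend(devoice_final(word))
    # Scan right to left so that the rightmost obstruent determines the cluster.
    for i in range(len(segs) - 2, -1, -1):
        left, right = segs[i], segs[i + 1]
        if is_obstruent(left) and is_obstruent(right):
            if right in DEVOICE:          # trigger is voiced
                segs[i] = VOICE.get(left, left)
            else:                         # trigger is voiceless
                segs[i] = DEVOICE.get(left, left)
    return "".join(segs)


print(surface_form(["kot", "byl"]))      # cluster surfaces as [db]
print(surface_form(["lug", "pokošen"]))  # cluster surfaces as [kp]
print(surface_form(["dub", "upal"]))     # final /b/ surfaces as [p]
```

Because final devoicing applies first and assimilation second, a voiced trigger in the following word re-voices a devoiced final stop, which is the sense in which assimilation "overrides" devoicing in the description above.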

Voice assimilation across a word boundary in Russian is reported to occur

inconsistently. Transcriptions of cases with obstruent clusters across a word boundary in

Avanesov (1968) show both devoiced and voiced obstruents before the voiced word-

initial obstruent which is supposed to trigger voice assimilation.4 The examples in (13)

are the same as those in (12), the only difference being that a pause occurs between the

two words, which causes word-final devoicing of a final obstruent in the first word in a

phrase.

(13) ko/t/ ## /b/yl [t | b] ‘the cat was’

lu/g/ ## /p/okošen [k | p] ‘the meadow was mowed’

Among the factors that affect voice assimilation across a word boundary,

researchers mention stress (Baranovskaja 1968, Shapiro 1993), speech tempo (Kn’azev

2004), semantics (Baranovskaja 1968), and individual variation (Paufošima and

Agaronov 1971). Assimilation is claimed to be more likely to occur when the words in

the phrase have one primary stress, constitute an idiom, or are pronounced in close

contact to each other in fast speech.

2.1.4. Sonorant transparency

Obstruents before sonorants (including sonorant consonants) are generally claimed to

retain their underlying laryngeal specifications, as shown in (14).

(14) /k/rot [k] ‘mole’ o/t/ nas [t] ‘from us’

/g/rot [g] ‘cave’ po/d/ nami [d] ‘under us’

4 Examples with no assimilation include [stix daidjot] ‘the verse will reach’, [rot gatovi] ‘kin

ready’, [kak zima] ‘as winter’, [ʃak galanskij] ‘pace of Holland’, and [vjek daʒivatj] ‘live out one’s days’.

Examples with assimilation include [kag daxodit] ‘as it reaches’, [drug gdrugu] ‘to one another’ (Avanesov

1968).

In some cases, however, obstruents in Russian have been reported to change their

voicing specifications before sonorant consonants. Jakobson (1978) reports that in his

speech, final obstruents in prepositions, e.g. iz ‘from/out of’ and ot ‘off/from’, agree in

voicing with the obstruents following the sonorant instead of preserving their laryngeal

specification in presonorant position, which is shown in (15).

(15) i/z mts/enska [smts] ‘from Mcensk’ (place name)

o/t mg/ly [dmg] ‘from haze’

He adds, however, that his idiolect is not always consistent with Standard Russian.

Following Jakobson, Hayes (1984) argues that sonorant transparency in voice

assimilation is a phonological process in fast speech. Unfortunately, he does not give

results of acoustic measurements or any demographic information about his language

consultants and what variety of Russian they speak. Nor does he provide any information

about how regular this process is.

Other linguists question the existence of sonorant transparency in Russian (e.g.

Es’kova 1971, Shapiro 1993, Kavitskaya 1999). This process is unusual from the

typological perspective, since the cases reported by Jakobson are restricted to a clitic

boundary in a prepositional phrase. No assimilation through a sonorant has been reported

within a word. For example, an initial voiceless obstruent in Russian last names of Polish

origin Kržyžanovskij, Prževalskij is never voiced before a voiced obstruent that follows a

sonorant. Voice assimilation through a sonorant has never been reported across a word

boundary, either (16).

(16) ka/dr/ # /p/lox [drp] ‘the frame is bad’

sv’o/kr/ # /b/olen [krb] ‘the father-in-law is sick’

Burton and Robblee (1997) found no evidence for sonorant transparency in clusters in

slow speech. No detailed instrumental study of these clusters was done in fast speech,

however. In spite of these doubts, claims about voice assimilation through a sonorant are

usually included in phonological descriptions of Russian (e.g. Petrova 2003, Rubach

2008), and sonorant transparency has been used to support important theoretical claims

(Kiparsky 1985, Steriade 1999).5

2.1.5. Voicing contrast before /v/

The voiced labiodental fricative /v/ in Russian has been reported to have

properties of both an obstruent and a sonorant (Jakobson 1968). Like any voiced

fricative, it undergoes word-final devoicing (17a). It also assimilates to the following

voiced or voiceless obstruent within a Phonological Word (17b).

(17) a. le/v/ ## [f] ‘lion’ Nom.sg. c.f. l/v/a [v] ‘lion’ Gen.sg.

le/v # l/i [fl] ‘whether the lion’

b. /v d/ome [vd] ‘in the house’

/v p/arke [fp] ‘in the park’

However, /v/ is claimed to have properties of a sonorant in terms of its position:

presonorant /v/ does not trigger voice assimilation in word-internal clusters (18) or across

a clitic boundary (19), as do other obstruents.

(18) /t/vorec [t] ‘creator’ vs. /d/vorec [d] ‘palace’

(19) o/t v/olgi [tv] ‘from the Volga’

na/d v/olgoj [dv] ‘over the Volga’

When /v/ occurs before an obstruent, it assimilates and triggers voice assimilation

of a preceding obstruent, as demonstrated in (20).

(20) o/t vd/ov [dvd] ‘from widows’

na/d vt/ornikom [tft] ‘over Tuesday’
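
The dual behavior attributed to /v/ in (17) through (20) can be stated as a rule over a single obstruent cluster that stands before a vowel or sonorant. The following minimal sketch encodes only the claims reported above, which later sections test experimentally; the function name and the one-character-per-segment transliteration are assumptions made for illustration.

```python
# Voiced -> voiceless obstruent pairs, here including /v/ ~ [f].
DEVOICE = {"b": "p", "d": "t", "g": "k", "z": "s", "ž": "š", "v": "f"}
VOICE = {voiceless: voiced for voiced, voiceless in DEVOICE.items()}


def assimilate_cluster(cluster):
    """Regressive assimilation in an obstruent cluster before a vowel or sonorant.

    A cluster-final /v/ is presonorant: it is treated as inert, so it neither
    triggers assimilation nor changes. Any other /v/ behaves like an obstruent.
    """
    core = cluster.rstrip("v")      # segments before the presonorant /v/, if any
    tail = cluster[len(core):]      # the inert presonorant /v/ (possibly empty)
    if not core:
        return cluster
    # The rightmost remaining obstruent determines voicing for the whole core.
    table = VOICE if core[-1] in DEVOICE else DEVOICE
    return "".join(table.get(seg, seg) for seg in core) + tail


print(assimilate_cluster("tv"))   # o/t v/olgi: no assimilation
print(assimilate_cluster("tvd"))  # o/t vd/ov: voicing through /v/
print(assimilate_cluster("dvt"))  # na/d vt/ornikom: devoicing, /v/ surfaces as [f]
```

Stripping the cluster-final /v/ before choosing the trigger is one way to express the claim that /v/ patterns with sonorants only when it is itself presonorant.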

5 Rubach (1996, 1997) argues that there is sonorant transparency in Polish. Voice assimilation

through a sonorant consonant is reported to be a regular process which occurs both word-internally and

across a word boundary. Sonorants are claimed to be transparent to regressive and progressive voice

assimilation. However, the preliminary acoustic analysis of Polish in Strycharczuk (2010a,b) raises

questions about whether sonorant transparency exists in Polish. Answers must await further acoustic

analysis.

The claims about voice assimilation before /v/ are not consistent. Reformatskii

(1975) argues that voice assimilation before /v/ is incomplete or optional because the

position before /v/ is contrastive in Russian even in cases when /v/ is not presonorant. In

line with this claim, Panov (1968) argues that voice assimilation before /v/ is optional

even in cases when /v/ occurs before another obstruent. He reports cases when speakers

of Russian do not assimilate stops in prepositions before /v/, e.g. o/t # vd/ov [tvd] ‘from

widows’ (cf. o/t # vd/ov [dvd], as previously shown in (20)).

2.2. Acoustic cues to voicing

Acoustic cues to voicing reflect temporal and spectral differences between voiced

and voiceless stops. The temporal cues usually discussed in the literature include voice

onset time (henceforth, VOT), (stop) closure duration, duration of voicing during closure,

and burst duration. In addition, duration of a preceding vowel is claimed to be an

important cue to voicing in intervocalic stops. Spectral cues include burst energy, as well

as fundamental frequency (henceforth, f0) and frequency of the first formant (henceforth, F1) of a

following vowel.

2.2.1. Voice onset time (VOT)

VOT, usually defined as the time between stop release and onset of vocal fold

vibration, has been shown to be a cue to voicing contrasts (Lisker and Abramson 1964).

In utterance-initial position, stops can be produced with voicing that starts before the

release (prevoiced stops with negative VOT), with voicing that starts immediately after

the release (voiceless unaspirated stops with short-lag VOT), and with voicing that starts

a relatively long time after the release (voiceless aspirated stops with long-lag VOT).

Lisker and Abramson demonstrated that languages tend to have contrastive categories of

stops based on differences in VOT. Some languages (e.g. Dutch, Spanish, Hungarian,

Tamil), true voice languages, have a two-way contrast between prevoiced and voiceless

unaspirated stops. Other languages (e.g. English, Cantonese), aspirating languages, have

a two-way contrast between voiceless unaspirated6 and voiceless aspirated stops.

Languages with a three-way contrast (e.g. Eastern Armenian, Thai) have all three

categories of stops – prevoiced, voiceless unaspirated, and voiceless aspirated.
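
As a rough illustration of the three VOT categories described above, the sketch below computes VOT from two time stamps and assigns a category. The function names are hypothetical, and the 30 ms default for the boundary between short-lag and long-lag VOT is an illustrative assumption only; actual boundaries vary with language, place of articulation, and speaking rate.

```python
def vot_ms(release_ms, voicing_onset_ms):
    """VOT: time from stop release to the onset of vocal fold vibration.

    Negative values mean that voicing starts before the release (prevoicing).
    """
    return voicing_onset_ms - release_ms


def vot_category(vot, long_lag_boundary_ms=30.0):
    """Classify a VOT value (ms); the boundary is an assumed, adjustable default."""
    if vot < 0:
        return "prevoiced"      # voicing lead, negative VOT
    if vot < long_lag_boundary_ms:
        return "short-lag"      # voiceless unaspirated
    return "long-lag"           # voiceless aspirated


# Release at 100 ms, voicing already under way from 20 ms: 80 ms of prevoicing.
print(vot_category(vot_ms(100.0, 20.0)))
# Release at 100 ms, voicing onset at 115 ms: short-lag.
print(vot_category(vot_ms(100.0, 115.0)))
```

On this scheme, a true voice language contrasts the first two categories and an aspirating language the last two, while a language with a three-way contrast uses all three.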

VOT was shown to be dependent on place of articulation: velar stops tend to have

longer positive VOT and shorter negative VOT in comparison with bilabial and coronal

stops. This pattern is consistent cross-linguistically (Cho and Ladefoged 1999). Variation

in VOT is dependent upon context. Positive VOT is usually longer before high vowels

than before low vowels (Cooper 1974, Summerfield 1974). Positive VOT is shorter

before vowels than before sonorant consonants (Klatt 1975), while negative VOT is

reported to be longer before vowels than before sonorant consonants (van Alphen and

Smits 2004, but see Ringen and Suomi 2012 for absence of this effect in Fenno-Swedish).

VOT is shown to be a very salient perceptual cue. Perception of this cue can be

observed even in infants (Eimas et al. 1971). In slow speech, the category boundary

between unaspirated and aspirated stops in English is usually reported to be around 30 ms

(e.g. Summerfield 1981, among others). Speakers of English consistently identify a stop

as “voiced” (i.e. lenis ‘b, d, g’) if aspiration is absent (Lisker 2003). The absence or

presence of prevoicing does not seem to be relevant for the correct identification of /p, t,

k/ in English. Similar results were obtained for German, another aspirating language

6 I will use a notation for underlying representations that is based on (assumed) phonological

specification: prevoiced stops in true voice languages specified with [voice] are represented as /b, d, g/;

unspecified stops (e.g. voiceless unaspirated stops in Russian, as well as lenis stops in aspirating languages,

which are traditionally called “voiced” in English or German) are represented as /p, t, k/; voiceless

aspirated stops in English or German are represented as /pʰ, tʰ, kʰ/. Many studies have investigated voicing

using English stops as examples and the authors of such studies often call stops in the lenis series “voiced”.

Although it is possible to find lenis stops in English or German with prevoicing or intervocalic voicing,

presence of vocal fold vibration in these stops is optional. In this study, the term “voiced” is used only for

stops that are underlyingly specified with the feature [voice] or for phonetic forms with vocal fold

vibration. Taking a neutral stand in the matter of phonological analysis of stops in aspirating languages, I

will use the term “lenis” for the unaspirated series in English or German and the term “fortis” for the

aspirated one. Other authors (e.g. Docherty 1992) use VOICE for the lenis series.

(Jessen 1998). Speakers can adjust perception of VOT at different speaking rates.

Summerfield (1981) reports that speakers usually associate longer VOT values with

slower speech and thus tend to interpret ambiguous VOTs as aspiration in faster speech.

Speakers have been shown to be sensitive to the VOT values that they hear on an everyday

basis. Adaptation to a different categorical boundary at one place of articulation leads to

changes in perception (Eimas and Corbit 1973) and production (Nielsen 2011) in other

categories. Such changes are also observed in language contact. Use of a language with a

different VOT can trigger a change in VOT in a native language (Sancier and Fowler

1997; Chang 2012). Long-term exposure to an aspirating language (e.g. in areas of

language contact) can result in consistent changes of VOT in a native non-aspirating

language. Bilingual speakers of Spanish or French with L2 English produce VOTs in

their native languages that have values closer to English than to Spanish or French

(Caramazza and Yeni-Komshian 1974, Flege 1991). Fowler et al (2008) argue, however,

that mere exposure to a language does not undermine the underlying system. Speakers

have to speak the other language. Fowler et al show that simultaneous English-French

bilinguals in Canada have VOT values in both languages that differ from those of monolingual

speakers of these languages, but VOT of monolingual French- and English-speaking

Canadians does not differ from VOT of monolingual French speakers in France or

monolingual English speakers in the US. Acquisition of L2 VOT values was shown to be

gradient: more proficient learners establish category boundaries very close to those of the

L2, while less proficient learners use intermediate VOT values (Flege and Eefting 1986,

Hazan and Boulakia 1993).

Studies of VOT in true voice languages show that positive VOT is context-

dependent: it is longer before high vowels than before low vowels in Hungarian (Gósy

2001) and Canadian French (Nearey and Rochet 1994). Negative VOT can also depend

upon context. Van Alphen and Smits (2004) report that prevoicing in Dutch is shorter

before a sonorant consonant (M=99 ms, SD=24) than before a vowel (M=117 ms,

SD=29). The category boundary is reported in the range between –20 and –10 ms (e.g.

Williams 1977a,b for Spanish, Hazan and Boulakia 1993 for French, Van Alphen and

Smits 2004 for Dutch). Speakers of true voice languages usually perceive a stop as

voiced if it has prevoicing (Lisker 2003).
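The category boundaries reviewed in this section can be made concrete with a minimal sketch. It is an illustration only, not a procedure used in any of the studies cited. The function names, the single −15 ms value (an assumed midpoint of the reported −20 to −10 ms range), and the 30 ms cutoff between short lag and long lag in production are assumptions introduced here.

```python
# Perceptual category boundaries in ms, based on the ranges cited in the text.
# The -15 ms value is an ASSUMED midpoint of the reported -20 to -10 ms range.
BOUNDARY_MS = {
    "aspirating": 30.0,   # e.g. English in slow speech
    "true_voice": -15.0,  # e.g. Spanish, French, Dutch
}


def production_category(vot_ms: float, long_lag_ms: float = 30.0) -> str:
    """Label a token by VOT region; long_lag_ms is a hypothetical cutoff."""
    if vot_ms < 0:
        return "prevoiced"
    if vot_ms < long_lag_ms:
        return "voiceless unaspirated"
    return "voiceless aspirated"


def perceived_series(vot_ms: float, language_type: str) -> str:
    """Assign a token to one of the two series of a two-way contrast."""
    boundary = BOUNDARY_MS[language_type]
    if language_type == "aspirating":
        return "fortis" if vot_ms >= boundary else "lenis"
    return "voiceless" if vot_ms >= boundary else "voiced"


# The same short-lag token falls into different series in the two systems.
print(production_category(10.0))             # voiceless unaspirated
print(perceived_series(10.0, "aspirating"))  # lenis
print(perceived_series(10.0, "true_voice"))  # voiceless
```

The sketch shows why a short-lag stop is ambiguous across language types: it belongs to the lenis series in an aspirating language but to the voiceless series in a true voice language.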

2.2.2. F1 frequency

The value of F1 is an established cue to a laryngeal contrast (Liberman et al 1958,

Summerfield and Haggard 1977). Lower F1 frequency on an adjacent vowel is caused by

the larynx lowering gesture, which lengthens the vocal tract (Westbury 1983). This

gesture expands the supralaryngeal cavity and facilitates voicing (see Ladefoged 1967,

Ohala 1972, Hombert et al 1979 among others). In addition, speakers of English (e.g.

Stevens and Klatt 1974) and other aspirating languages (e.g. Pind 1999 for Icelandic) are

sensitive to the length of F1 transition as well as to absolute F1 values. Because aspirated

stops in English have long positive VOT, formant transitions usually occur before vowel

onset. Vowels after voiceless unaspirated stops, in contrast, have short formant

transitions; thus, speakers interpret the absence of an audible F1 transition vs. a short F1 transition

as a cue to the laryngeal contrast.

F1 can affect perception of VOT and the two cues have been shown to have a

trading relationship in perception in English (Lisker 1975). It is possible, however, that

speakers of true voice languages use the F1 cue differently because vowel onset occurs

immediately after release in both voiced and voiceless stops. Benkí (2005) claims that F1

transition is a universal cue to voicing but it can be used differently in true voice and

aspirating languages. He argues that the difference between a true voice language

(Spanish) and an aspirating language (English) is in the relationship between F1 and

VOT. Speakers of Spanish show sensitivity to F1 only when voiced and voiceless stops

have ambiguous VOTs (i.e. positive short lag) whereas speakers of English categorize

tokens with higher F1 as aspirated even if they have unambiguous short lag VOTs.

2.2.3. Fundamental frequency (f0)

Vibration of the vocal folds causes lower f0 on the following vowel due

to less vertical tension in vocal cords (vocal cord slacking). Speakers use differences in

pitch after stop release to discriminate between voiced and voiceless stops (Haggard et al

1970). Since vocal cord slacking in voiced stops often co-occurs with the larynx lowering

gesture, it is not unusual that some authors (e.g. Kingston 2005) consider f0 and F1 as

one, integral cue. However, unlike F1, f0 seems to be independent of VOT. Ohde (1984)

argues that f0 values after stop release in English depend on phonological category rather

than on context-dependent VOT duration. Speakers of English produce relatively high f0

values after voiceless unaspirated stops in s-clusters (e.g. [t] in ‘stop’), as well as after

aspirated stops (e.g. [tʰ] in ‘top’) in spite of their differences in VOT duration. On the

other hand, speakers produce lower f0 after initial lenis stops (e.g. [t] in ‘doll’) although

these stops have the same VOT as do stops in s-clusters.

Cross-linguistically, lower f0 values usually correlate with voicing and higher f0

values are usually found with voiceless stops. This has motivated some authors (e.g.

Kingston and Diehl 1994) to claim that lower f0, found in English lenis stops, is

characteristic of the underlying voicing contrast in English. Yet, differences in f0 are not

bound exclusively to the contrast between voiced and voiceless stops. Cho et al (2002)

report that speakers of Korean use f0 as a cue to distinguish between initial voiceless

unaspirated /p, t, k/ and voiceless aspirated /pʰ, tʰ, kʰ/, with both series pronounced

without vocal fold vibration.

2.2.4. Duration of a preceding vowel

Duration of a vowel correlates with the laryngeal contrast in a variety of

languages. Peterson and Lehiste (1960) report that duration of a preceding vowel is

considerably (60%) longer in English before final lenis stops than before fortis stops. The

difference in vowel duration is smaller (20%) but still consistent before intervocalic

stops, which are often phonetically voiced in this position (Sharf 1962). Other languages

(e.g. French, Russian, Korean) also exhibit this tendency (Chen 1970). The same

tendency was observed in Spanish (Zimmerman and Sapon 1958), German and

Norwegian (Fintoft 1961). The authors report that vowels are, on average, 10-30% longer

before voiced stops than before voiceless stops, with absolute values for difference

varying from 18 to 53 ms, although the authors do not specify whether this difference in

vowel duration is a function of phonetic or phonological voicing, or both.

Chen (1970) argues that the most plausible explanation for such differences is a

universal articulatory mechanism of transition from vowel to consonant closure.

Nevertheless, the English example demonstrates that implementation of this mechanism

can be language-specific. Other authors (e.g. Laeufer 1992) argue that although

languages may vary in absolute values and ratios of difference in vowel duration before

voiced and voiceless stops, the effect itself is still universal. Laeufer claims that duration of vowels in

French and English is affected by identical contextual factors, such as syllabification and

speaking rate.

It is not clear whether difference in vowel duration is a perceptual cue across

languages. In English, where the difference is great, it is an important cue in perception

of “voicing” (see Jongman et al 1992, among others). In most languages examined in

Chen (1970), in contrast, the difference is below or just at the threshold of human

perception. In addition, some factors can attenuate the effect of voicing on vowel

duration. The effect of voicing is not found in languages with contrastive vowel length

(Keating 1985, although see Campos-Astorkiza 2006 for conditions of such an effect).

2.2.5. Voicing during closure

Following the seminal paper by Lisker and Abramson (1964), researchers have

focused almost exclusively on studies of VOT in utterance initial stops. Few studies have

investigated voicing in intervocalic stops and most of these studies have examined stops

in English (see Lisker and Abramson 1967 for American English and Docherty 1992 for

British English) or German (Beckman et al, in press). Voiced and voiceless stops in the

intervocalic position differ in the length of voicing during closure. Voiceless stops often

have a short voicing tail into closure; voiced stops have vibration of the vocal folds,

which may not last for the whole closure. Jessen and Ringen (2002) found that so-called

“voiced”, or lenis intervocalic stops in German are often produced without robust voicing

during closure. Beckman et al (in press) report that only 62.5% of German lenis/lax stops

were produced with voicing for more than 90% of their closure. They argue that variation

in voicing during closure occurs because the feature of contrast in German is [spread

glottis], not [voice]. They suggest that little variation occurs when speakers are aiming to

voice a stop, i.e. when the feature of contrast is [voice].

A tendency to break voicing during closure, similar to that found in German, was

found in intervocalic stops in English, another aspirating language. Lisker and Abramson

(1967) and Docherty (1992) report that only about 50% of English word-initial stops in

connected speech (i.e. in an intervocalic environment) are voiced throughout the entire

closure; the rest have broken voicing, i.e. voicing is interrupted in the middle of the

closure. Duration of voicing during closure may be context-dependent. Docherty (1992)

found that lenis stops in English have broken (i.e. shorter) voicing more often before

sonorant consonants than before vowels.

The patterns found in aspirating languages such as English or German are

different from patterns observed in true voice languages. In Russian, a true voice

language, broken voicing in intervocalic stops is not a pervasive pattern. Barry (1995)

and Ringen and Kulikov (in press) report that 96-98% of voiced intervocalic stops are

fully voiced, i.e. produced with unbroken voicing.
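The percentages reported in this section presuppose a criterion for counting a token as fully voiced. The sketch below shows one way such a criterion could be computed. It borrows the 90% figure mentioned above for the German data as a default cutoff, which is an assumption. The function names and the token durations are hypothetical and are not taken from any of the studies cited.

```python
def voicing_proportion(voicing_ms: float, closure_ms: float) -> float:
    """Share of the closure interval that shows vocal fold vibration (0 to 1)."""
    if closure_ms <= 0:
        raise ValueError("closure duration must be positive")
    # Voicing cannot exceed the closure it is measured within.
    return min(voicing_ms, closure_ms) / closure_ms


def is_fully_voiced(voicing_ms: float, closure_ms: float,
                    threshold: float = 0.9) -> bool:
    """True if voicing covers more than `threshold` of the closure (assumed cutoff)."""
    return voicing_proportion(voicing_ms, closure_ms) > threshold


def percent_fully_voiced(tokens, threshold: float = 0.9) -> float:
    """Percentage of (voicing_ms, closure_ms) tokens that count as fully voiced."""
    flags = [is_fully_voiced(v, c, threshold) for v, c in tokens]
    return 100.0 * sum(flags) / len(flags)


# Hypothetical tokens: (voicing duration, closure duration) in ms.
tokens = [(60, 60), (20, 70), (66, 70), (15, 65)]
print(percent_fully_voiced(tokens))  # 50.0
```

Under this criterion, a token with a short voicing tail into closure and a token with broken voicing both count as not fully voiced, so the measure separates robust voicing from the rest but does not distinguish among the other patterns.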

2.3. Voicing contrasts and phonological features

The distinction between voiced and voiceless obstruents in languages has been

traditionally represented in phonology with a binary feature [±voice] (Keating 1984,

Kingston and Diehl 1994). The category orthographically represented as ‘b, d, g’ is specified

as [+voice], and the other category, orthographically represented as ‘p, t, k’ is [–voice].

Under this theory the differences in actual phonetic shape of phonologically [+voice] and

[–voice] stops are characterized as differences in phonetic implementation of the phonological

feature, VOT being the parameter of variation. The feature [+voice] in Polish, a true voice

language, is implemented by (pre)voiced stops, but in English it is usually implemented by

voiceless unaspirated stops. Similarly, [–voice] is implemented with both voiceless aspirated

stops and voiceless unaspirated stops (as in clusters with ‘s’) in English, but with voiceless

unaspirated stops in Polish.

The binary feature [±voice] is, however, unable to account for a wide variety of

laryngeal contrasts across languages. Halle and Stevens (1971) propose four binary

laryngeal features that capture distinctions in width and constriction of the glottis

([±spread glottis], [±constricted glottis]), as well as slackness of the vocal folds

([±slack]), which facilitates vibration, and stiffness ([±stiff]), which inhibits vocal fold

vibration. According to Halle and Stevens (1971: 52), long positive VOT is attributed to

width of the glottal opening during closure and release; aspiration is thus a direct result of

turbulent noise originating in a spread glottis (Kim 1970). Voiceless aspirated stops in

aspirating languages are [+stiff, –slack, +sg, –cg]; voiceless unaspirated stops in

languages like English are [–stiff, –slack, –sg, –cg]. Their [–stiff, –slack] specification

indicates that they can be produced with vocal fold vibration.

Voiceless unaspirated stops and voiced stops in true voice languages share the

same [–sg] specification for glottal opening but they are different in the state of the vocal

folds. Halle and Stevens (1971) argue that voiced stops in true voice languages are

specified as [–stiff, +slack, –sg, –cg] and voiceless stops are specified as [+stiff, –slack,

–sg, –cg]. The feature [+stiff] ensures that voiceless stops in these languages are produced

without vocal fold vibration.
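The four specifications given above can be laid out as a small feature table, which makes it easy to see which features separate any two categories. This is a minimal sketch for illustration. The dictionary keys and function names are assumptions introduced here, and the table covers only the specifications mentioned in the text.

```python
from itertools import product

FEATURES = ("stiff", "slack", "sg", "cg")

# Specifications as given in the text (True = "+", False = "-").
SPECS = {
    "aspirated (aspirating lg.)": dict(stiff=True, slack=False, sg=True, cg=False),
    "unaspirated (English-type)": dict(stiff=False, slack=False, sg=False, cg=False),
    "voiced (true voice lg.)": dict(stiff=False, slack=True, sg=False, cg=False),
    "voiceless (true voice lg.)": dict(stiff=True, slack=False, sg=False, cg=False),
}


def show(spec: dict) -> str:
    """Render a specification in bracket notation, e.g. [-stiff, +slack, ...]."""
    return "[" + ", ".join(("+" if spec[f] else "-") + f for f in FEATURES) + "]"


def differing(a: dict, b: dict) -> list:
    """Features whose values distinguish two specifications."""
    return [f for f in FEATURES if a[f] != b[f]]


def all_combinations() -> list:
    """Every logically possible setting of the four binary features."""
    return [dict(zip(FEATURES, values))
            for values in product((True, False), repeat=len(FEATURES))]


voiced = SPECS["voiced (true voice lg.)"]
voiceless = SPECS["voiceless (true voice lg.)"]
print(show(voiced))                  # [-stiff, +slack, -sg, -cg]
print(differing(voiced, voiceless))  # ['stiff', 'slack']
print(len(all_combinations()))       # 16
```

The comparison confirms that voiced and voiceless stops in true voice languages differ only in the state of the vocal folds. The count of 16 shows how many combinations a system of four binary features makes available in principle.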

Although Halle and Stevens’ system accounts for some laryngeal contrasts, it

predicts glottal specifications that are not attested in languages (e.g. [+stiff, +slack, +sg,

+cg]). Another proposal for laryngeal features has been widely accepted. Researchers

such as Lombardi (1995, 1999) and Iverson and Salmons (1995) argue that laryngeal

features are privative and have only one value. It is claimed that only positive values are

active in phonological processes and can be referred to in phonological rules. One crucial

question, however, is what the grounds are for a particular laryngeal feature in different

languages.

Jessen and Ringen (2002) and Beckman et al (in press) argue that a contrastive

feature is grounded in laryngeal gestures. Aspirated stops, specified with [spread glottis],

have an active glottal opening and voiced stops, specified with [voice], have active

voicing gestures, such as vocal fold slacking and vibration, usually accompanied by

tongue root advancement, or lowering of the larynx, as shown in Westbury (1983).

Voiceless unaspirated stops, the most common and unmarked category of stops cross-

linguistically, are not specified.⁷

Under the privative feature theory, the phonological representation of Russian

voiced stops, which have robust prevoicing in utterance-initial position and robust

voicing in intervocalic stops (Ringen and Kulikov, in press) would be specified with the

feature [voice]. Voiceless unaspirated stops would be unspecified. Following Mester and

Itô’s (1989) and Cho’s (1990) approach to assimilation, the feature [voice] is shared by

spreading in the process of voice assimilation. Devoicing in word-internal clusters, which

7 To account for implosive stops in languages or for consonant-vowel height interactions as in

Madurese, some authors add the feature [larynx height] (Avery and Idsardi 2001) or [lowered larynx] (Cohn

and Lockwood 1994) to the laryngeal node in addition to the features [sg], [voice], and [cg].

are phonetically realized as voiceless, is represented as feature delinking so that both

stops in a cluster lose laryngeal specification. Devoiced final stops also are delinked and

hence unspecified.

2.4. Voicing contrasts and speaking rate

Studies of the effect of speaking rate on production of the laryngeal contrast (e.g.

Kessinger and Blumstein 1997, Magloire and Green 1999, Solé and Estebas 2000) show

that manipulation of speaking rate asymmetrically affects voicing categories. In

languages with a two-way contrast, speakers lengthen VOT in slower speech only in one

category: speakers of English lengthen aspiration, but speakers of French, Spanish, or

Catalan lengthen prevoicing.

It is less clear why one but not the other category is affected by rate. Several

proposals have been introduced to account for this effect. Beckman et al (2011) argue

that effects of speaking rate are found in the acoustic cues that correlate with active

phonological features. For the laryngeal contrasts, the privative phonological feature

[spread glottis] underlies a contrast in aspirating languages (e.g. German, English,

Icelandic), and the feature [voice] underlies a contrast in true voice languages (e.g.

Spanish, French, Russian). Voiceless unaspirated stops are not specified for a laryngeal

feature in any language (Iverson and Salmons 1995). Hence, VOTs in aspirated stops

and prevoiced stops are longer in slow speech, but no change occurs in the short-lag VOT

of voiceless unaspirated stops. This makes the correct prediction for systems with a three-

way contrast (e.g. Thai). In these systems, aspirated stops are specified with [sg], voiced

stops are specified with [voice], and voiceless unaspirated stops are unspecified.

Therefore, only VOT of phonologically specified stops (i.e. voiced and aspirated) should

be affected by speaking rate. Kessinger and Blumstein (1997) report that in Thai, a

language with three series of stops, speakers lengthen VOT in initial voiceless aspirated

stops and in prevoiced stops, but voiceless unaspirated stops are not affected by changes

in speaking rate.
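The logic of this account can be stated as a simple rule: a stop category is predicted to lengthen in slow speech only if it carries a privative laryngeal feature. The sketch below encodes that rule. The table and function names are assumptions introduced for illustration, and the inventories list only languages named in this section.

```python
# Privative laryngeal specification per stop category; None = unspecified.
SPECIFICATION = {
    "prevoiced": "voice",
    "voiceless unaspirated": None,
    "voiceless aspirated": "spread glottis",
}

# Stop series per language, following the classification in the text.
INVENTORY = {
    "English": ["voiceless unaspirated", "voiceless aspirated"],
    "French": ["prevoiced", "voiceless unaspirated"],
    "Thai": ["prevoiced", "voiceless unaspirated", "voiceless aspirated"],
}


def lengthens_in_slow_speech(category: str) -> bool:
    """Predicted rate effect: only phonologically specified stops change VOT."""
    return SPECIFICATION[category] is not None


def predictions(language: str) -> dict:
    """Map each stop series of a language to its predicted rate sensitivity."""
    return {c: lengthens_in_slow_speech(c) for c in INVENTORY[language]}


print(predictions("Thai"))
# {'prevoiced': True, 'voiceless unaspirated': False, 'voiceless aspirated': True}
```

The rule is stated over categories rather than languages, so the asymmetry in two-way systems and the pattern in a three-way system follow from the same statement.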

2.5. Effects of speaking rate and environment on voicing in

Russian: Rationale for the study

The overview presented in sections 2.1.3 – 2.1.5 demonstrates that the most

controversial cases of voicing in Russian (i.e. cases of so-called “sonorant transparency”

and incomplete voice assimilation in prepositions or before /v/) emerge when contextual

effects on voicing interact with effects of speaking rate. First, cases of “sonorant

transparency” are claimed to be observed only in fast speech (Hayes 1984). Next, results

of voice assimilation in obstruent clusters are different in slow and fast speech (Kulikov

2010). In addition, the results in van Alphen and Smits (2004) suggest that acoustic

parameters of phonetic voicing in stops may vary before vowels and sonorant consonants.

Therefore, a proper analysis of voicing in Russian requires a detailed examination of

acoustic cues for voiced and voiceless stops in different positions and at different

speaking rates; yet, no such systematic data exist. Although studies of voicing contrasts

in the world’s languages are numerous, most of them were done on English, an aspirating

language; thus, their results cannot be automatically applied to cases of true voice

languages. Speakers of languages parse and interpret the same cues differently,

depending on the phonology of a language (Gow 2003).

The study reported in the following chapters, which investigates acoustic cues for

voicing in Russian, rests on the following assumptions:

1. Contrastive features are grounded in articulatory gestures. The feature [voice]

in a phonological system correlates with active voicing gestures. When stops are

phonologically specified with [voice], this means that speakers usually actively use the

articulatory gesture to produce voicing. This may involve active vocal fold adduction and

the appropriate configuration of the larynx that facilitates vocal fold vibration, i.e.

lowering of the larynx, vocal fold slacking, etc. (Ladefoged and Maddieson 1996). The

cues to such a contrast should capture the differences in the primary gesture. Hence, the

cues to voicing in stops are expected to vary along the parameters that distinguish the

configuration of the vocal tract and larynx in voiced and voiceless stops.

The cues that usually correlate with vibration of the vocal folds are VOT (in

utterance-initial stops) and voicing during closure (in intervocalic stops). Differences in

the configuration of the larynx are typically manifested as lower f0 due to slacking of the

vocal folds and lower F1 due to expansion of the vocal tract in voiced stops and,

conversely, higher f0 due to stiffening of the vocal folds and higher F1 due to a shorter

vocal tract in voiceless stops. In addition, voiced stops have shorter closure duration and

longer duration of the preceding vowel.

2. The effect of speaking rate can be a tool to identify an active phonological

feature. Studies of effects of speaking rate on voicing show that speakers change VOT,

one of the most important temporal acoustic correlates of voicing, as a function of

speaking rate. In languages with a [voice] contrast this change affects only one member of the

contrast: negative VOT, i.e. prevoicing. In line with these studies, it is reasonable to

expect that all stops specified with the feature [voice] in Russian should exhibit the same

relationship between speaking rate and duration of voicing in all positions. Indeed,

laryngeal features specify not only initial stops but also intervocalic stops and stops in

stop clusters. In utterance-initial stops, active voicing is realized as negative VOT, with

excitation of the larynx occurring before the release of a stop. In intervocalic stops, this is

realized as voicing during closure, which ultimately means the same: vocal fold vibration

occurs during the closure of the stop.

VOT in utterance initial stops is used more often as diagnostic of laryngeal

contrasts because it unequivocally points to a laryngeal gesture: e.g. prevoicing or

aspiration. Compared with VOT, voicing during closure is more ambiguous as a cue

because it can reveal either an active voicing gesture, which is indicative of the feature

[voice] in phonology, or passive voicing, which starts in a previous voiced segment and

continues during closure in the intersonorant environment that facilitates voicing (Jessen

and Ringen 2002, Beckman et al, in press). The latter case is usually found in aspirating

languages with the feature [spread glottis]. Passive voicing in “voiced” intervocalic stops

is typically manifested as broken voicing and it correlates with (short) positive VOT in

such stops in utterance-initial position. By contrast, active voicing in intervocalic stops is

manifested as fully voiced closure with very few cases of broken voicing (Ringen and

Kulikov, in press).

If the feature of contrast in Russian is [voice], effects of speaking rate

manipulation on voicing are expected in voiced stops, but not in voiceless stops. I also

predict that speakers will have longer voicing in non-initial voiced stops in slow speech,

but VOT and voicing during closure in voiceless stops will not be affected by changes in

speaking rate. Voiceless stops in Russian are not produced with voicing; they have

positive short lag VOT, and intervocalic stops also have a short voicing tail into closure.

If Russian voiceless stops are unspecified for a laryngeal feature, they are expected to

show no effect of speaking rate. Extending the existing model, I hypothesize that the

effect of rate will not be found for temporal acoustic cues in voiceless stops in any

positions.

This model makes clear predictions about assimilated stops in clusters and final

stops. If, as usually claimed, all word-final stops undergo final devoicing and are

produced as voiceless, no effect of speaking rate on voicing is expected in final stops for

either underlying voiced or voiceless stops. Stops in clusters are assimilated in voicing to

the following stop. Thus, I expect that voicing during closure in C1 stops will change as a

function of speaking rate according to the voicing properties of C2 stops, rather than the

underlying voicing of C1 stops.

This model also makes a prediction for “sonorant transparency” cases in Russian.

If voice assimilation through a sonorant is indeed a phonological process in Russian,

voicing in allegedly assimilated C1 stops in obstruent-sonorant-obstruent clusters should

change in slow and fast speech, and a strong effect of the following obstruent is expected.

Absence of transparency to voice assimilation predicts that voicing in underlying voiced

C1 stops should change as a function of speaking rate, while voicing in underlying

voiceless C1 stops should not.

3. The contextual effect of sonorant type can explain variation in voicing duration

in different types of clusters. If duration of voicing in stops is contingent on sonority of

the following segment, the longest duration is expected in prevocalic stops. Duration is

expected to gradually decrease before less sonorous segments – sonorant consonants, and

decrease further on before voiced stops. It should be the shortest before voiceless

fricatives and stops. Shorter duration of voicing before sonorant consonants does not

necessarily mean phonological voice assimilation. It can be a regular phonetic process

caused by less favorable conditions for voicing in longer and less sonorous clusters. Only

effects of the following C2 obstruent on voicing of C1 stops should unambiguously

indicate voice assimilation. Thus, variation in voicing duration in C1 stops in obstruent-

sonorant-obstruent clusters, defined by some authors as “sonorant transparency”, i.e.

phonological voice assimilation through a sonorant, could be a phonetic artifact and

merely an effect of a less sonorous segment on voicing, with C1 stops still preserving

underlying voicing, rather than the effect of a C2 obstruent.
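The third assumption predicts a monotonic ordering of voicing duration across following contexts. A minimal sketch of how that ordering could be checked against a set of mean durations is given below. The context labels, the function name, and the durations are hypothetical and serve only to illustrate the prediction.

```python
# Following-segment classes ordered from most to least sonorous, as in the text.
CONTEXT_ORDER = [
    "vowel",
    "sonorant consonant",
    "voiced stop",
    "voiceless obstruent",
]


def follows_sonority_prediction(mean_voicing_ms: dict) -> bool:
    """True if voicing duration never increases as the context gets less sonorous."""
    durations = [mean_voicing_ms[context] for context in CONTEXT_ORDER]
    return all(a >= b for a, b in zip(durations, durations[1:]))


# Hypothetical mean voicing durations (ms) in C1 stops, by following context.
consistent = {"vowel": 80, "sonorant consonant": 65,
              "voiced stop": 55, "voiceless obstruent": 10}
inconsistent = {"vowel": 80, "sonorant consonant": 40,
                "voiced stop": 55, "voiceless obstruent": 10}

print(follows_sonority_prediction(consistent))    # True
print(follows_sonority_prediction(inconsistent))  # False
```

A pattern that passes this check is compatible with a purely phonetic effect of sonority. It does not by itself rule out assimilation, which requires evidence of an effect of the C2 obstruent.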

The next chapter investigates cues to voicing in regular cases of word-internal

stops. These results are later used to consider the controversial cases of voice assimilation

in Russian.

CHAPTER III

EXPERIMENT 1: VOICING IN WORD-INTERNAL STOPS

3.1. Motivation for the study

Only a few studies have addressed the important questions about how the voicing

contrast in stops is realized in Russian. Although the role of VOT and closure voicing is

quite clear, there is a range of secondary cues (e.g. vowel duration, f0, or F1) that may

or may not be used. These cues may work differently in processes of devoicing and voice

assimilation.

Ringen and Kulikov (in press) investigated VOT in initial stops and duration of

voicing during closure in intervocalic voiced and voiceless stops. Their subjects were 14

speakers of Russian in St. Petersburg, who read the word list at a comfortable tempo. The

results show that very little overlap in VOT between voiced and voiceless tokens

occurs in initial stops. The majority of voiced stops (98%) were produced with robust

prevoicing (M= –74 ms, SD=28); voiceless stops were produced with a short lag VOT

(M=23 ms, SD=8). Intervocalic voiced stops were produced as fully voiced in 98% of

cases. Intervocalic voiceless stops were voiceless, with a short voicing tail into closure

(23% of closure duration) and a positive short-lag VOT (M=22 ms, SD=5).

Robust voicing in intervocalic word-medial stops was found in Barry (1995), who

reports that speakers produced three patterns of voicing in voiced intervocalic stops in

Russian: voicing during the entire closure with no or little change in amplitude, voicing

during the entire closure with decreasing amplitude of vocal fold vibration, and

interrupted voicing during closure (partially devoiced stops). 95% of stops in the data

were produced as fully voiced, although some female speakers showed a higher rate of

devoicing: e.g. speaker EK partially devoiced 25% of her stops, and speaker IL partially

devoiced 8% of her stops.

Barry (1988) compared duration of closure of intervocalic obstruents and final

obstruents,⁸ which undergo devoicing in Russian. She found that intervocalic voiced

stops were, on average, consistently shorter than voiceless stops (by 18% for [d] and 30%

for [g]). Barry (1988) argues that the voicing contrast is neutralized word-finally. No

significant difference in duration was found between underlying voiced and voiceless

final stops. Duration of voicing in final stops was not different in underlying voiced and

voiceless stops for most speakers, either; however, speaker 2 produced a 19 ms difference

in voicing between voiced and voiceless stops.

Other studies provide evidence that final devoicing in Russian is incomplete in

production (Pye 1986, Shrager 2006, Dmitrieva et al 2010). Pye (1986) reports that

duration of closure in underlying voiced stops is 9% shorter than in underlying voiceless

stops, and it is affected by place of articulation. The greatest difference (15%) was

observed in bilabial stops, but in coronal stops it was almost completely neutralized (2%).

Shrager (2006) found a significant difference in burst energy between underlying

voiced and voiceless coronal stops. Matsui (2011) argues that speakers of Russian use

these phonetic differences to recover underlying forms in discrimination and

identification tasks.

It is less clear how important other acoustic cues are for the voicing contrast in

Russian. Studies of English reveal that fundamental frequency after the burst and the

frequency of the first formant are important cues to the laryngeal contrast (House and

Fairbanks 1953). Westbury (1983) showed that speakers of English often lower the

larynx when they produce lenis stops, but F1 frequency, which is indicative of lowering

of the larynx, is not always significantly lower, especially for initial stops. Since the

8 Barry (1988) and Barry (1995) examined both stops and fricatives. Only the results for stops are

discussed here.

English contrast is clearly not the same as in Russian, it is not clear that these results have

relevance for Russian. There are no systematic data on these cues for Russian. Halle

(1959) reports that frequencies of the first formant in Russian tend to be lower after initial

voiced stops than after voiceless stops, but no statistical analysis was performed.

Acoustic properties of voicing in obstruent clusters in Russian are also

understudied. Burton and Robblee (1997) found that assimilation was not complete in

stops in prepositions, but they argue this was the effect of the clitic boundary. I am not

aware of research on voice assimilation in word-internal clusters in Russian. Therefore,

acoustic measurements of assimilated stops within a word are essential to establish

general parameters of voice assimilation in Russian.

Duration of a vowel preceding an obstruent was examined in Barry (1988). She

found that vowels were 16% longer before intervocalic voiced stops than before voiceless

stops. Before word-final stops, in contrast, no significant difference in vowel duration

was found; however, her chart (p.85) suggests that most speakers did produce some

difference. Pye (1986) reports that the difference in vowel duration before word-final

stops is the greatest for the pairs /p, b/ (36%) and /k, g/ (17%) but it is smaller (9%) for

coronal stops.

This brief review clearly shows that the acoustic data on Russian voicing are

incomplete. Thus, the goal for this experiment was to collect systematic data on the

voicing contrast in Russian and fill these gaps: (i) to establish acoustic parameters of the

phonological contrast in Russian voiced and voiceless stops in initial, intervocalic, and

final positions within a word, as well as in word-internal clusters, and (ii) to test the

effects of speaking rate and environment on acoustic cues for voicing. Four sets of tests

examined word-initial stops, word-medial stops, word-internal stops in obstruent clusters,

and word-final stops.

3.2. Method

3.2.1. Participants

Fourteen native speakers of Russian, seven males and seven females, participated.

Their mean age was 19.0 years (SD=1.9; range: 18–25 years). They were monolingual

speakers9 who had grown up and resided in Russia and spoke educated Standard Russian.

The participants had no history of speech or hearing disorders. They were naïve as to the

purpose of the experiment and they were paid a standard hourly rate for their

participation in the study.

The data were collected during a field trip to Tambov, Russia. Russian speakers in

the United States were not used because studies (Caramazza et al 1973, Chang 2012,

Nielsen 2011, Sancier and Fowler 1997 among others) show that speakers are sensitive to

differences in VOT in languages that they use on a daily basis. Even a relatively short

period of time spent in a different country can trigger changes in the voicing properties of

sounds in a native language. After staying in the USA for several months, speakers of

Russian begin to pronounce voiced and voiceless sounds in their native language in a way

which is more similar to English sounds (Dmitrieva et al 2010).

3.2.2. Stimuli

The stimuli were words and phrases with voiced and voiceless stops in the four

environments: word-initial, word-medial in an intervocalic position, word-medial in a

stop-stop cluster, and word-final. The full list of target phrases is given in Appendix

A(1). As noted, the terms ‘voiced’ and ‘voiceless’ in this experiment refer to phonetic

voicing. To distinguish between underlying voicing in C1 stops and voicing resulting

9 They learned some English or German in middle and high school; however, the input was not

naturalistic, as the foreign language was taught by non-native speakers. None of the participants actively

spoke the foreign language on an everyday basis.

from voice assimilation, the term ‘underlying’ is used to refer to underlying specification

for voicing and the term ‘assimilated’ is used for stops that have changed their voicing

properties due to voice assimilation: e.g. a C1 stop in a /tg/ cluster that was pronounced

as [dg] can be called ‘underlying voiceless’ or ‘assimilated voiced’. C2 stops that trigger

voice assimilation in clusters are always called ‘voiced’ or ‘voiceless’ as they do not

change their underlying specifications.

3.2.2.1. Word-initial stops

The list of words included six minimal and near-minimal pairs with underlying

voiced and voiceless initial stops at three places of articulation: bilabial, coronal (dental),

and velar. Stops occurred before a vowel and before a sonorant consonant [r]: e.g. talyj

‘melted’, travy ‘grasses’, darom ‘for free’, drama ‘drama’. The stress in all words was on

the first syllable.

3.2.2.2. Word-medial intervocalic stops

The list of words included six minimal and near-minimal pairs with underlying

voiced and voiceless stops at three places of articulation: bilabial, coronal (dental), and

velar. Stops occurred before a vowel and before a sonorant consonant [r]: e.g. zador

‘zest’, (dva) kadra ‘two frames’, motor ‘engine’, u teatra ‘at the theater’.

3.2.2.3. Word-medial stops in obstruent clusters

The list included two pairs of words with underlying voiced and voiceless coronal

stops before two suffixes: one with a voiced stop [b] and one with a voiceless stop [k]:

e.g. molot’ba ‘threshing’, gorod’ba ‘fence’, u katka ‘at the skating rink’, u sadka ‘at the

cage’. The limited number of target words is due to lexical limitations: very few words in

Russian have a stop before the suffix -b-.

3.2.2.4. Word-final stops

The list of target words included three minimal and near-minimal pairs with

underlying voiced and underlying voiceless stops at three places of articulation (bilabial,

coronal, and velar) in a word-final position: e.g. u vorot ‘at the gates’, narod ‘people’.

The stimuli for all four positions were compiled in one list and randomized to

mask the location of target segments. In addition to the 34 target phrases, 22 fillers were

added. These were words with voiced and voiceless fricatives in initial, medial, and final

positions and words with assorted clusters. The total number of phrases given to the

speakers was 56. The list was given to the participants as one set.

3.2.3. Procedure and measurements

The participants were asked to read the list of phrases in three conditions: as a list,

with a complete pause between phrases (“list” reading), within a carrier phrase Skaži

_____ ešče raz ‘Say ____ again’ at a comfortable tempo (“slow” reading), and within the

same carrier phrase at a fast tempo (“fast” reading). In the third condition the subjects

were instructed to pronounce phrases as fast as they could but not at the expense of

comprehensibility. They were permitted to correct themselves if they did not like their

reading, in which case the last reading was selected for analysis. They read the list of

phrases in each condition three times, but only the second and third readings were

recorded. The total number of target tokens was 2856 (34 target phrases x 3 speech

conditions x 2 readings x 14 speakers).

The speakers were digitally recorded in a quiet room using a one-point condenser

SHURE WH30XLR microphone connected to an M-Audio MobilePre USB soundcard

through an XLR interface. The microphone was placed 20 mm away from the right

corner of the mouth. The recording was made at 44,100 Hz and then downsampled to

22,050 Hz for acoustic analysis. A digital high-pass filter with a 70-Hz cutoff frequency

was applied to the speech waveform. The high-pass filter served to reduce oscillations

resulting from room vibration as well as to suppress the microphone air blast artifact

associated with plosive speech productions.

Three tokens were discarded due to mispronunciation. Thus, 2517 tokens were

chosen for the analysis. The segments were manually marked for boundaries in PRAAT

(Boersma & Weenink 2011). Duration of target words was used as the means to compare

the differences in speech tempo. Longer duration was used as the indication of a slower

tempo.

Testing negative VOT (lead voice) in Russian is problematic for word-initial

stops in connected speech. According to Lisker and Abramson (1964), VOT is the

interval between the onset of vocal fold vibration and release of a stop. In prevoiced stops

VOT is negative, which means that vocal fold vibration starts before the release. In

connected speech, however, voicing in a voiced word-initial stop can either start during

the closure phase of stop articulation (negative VOT) or continue from the previous

voiced segment. In order to avoid continuous voicing from a previous segment, other

studies of effects of speaking rate on VOT (e.g. Kessinger and Blumstein 1997, Beckman

et al 2011) used a carrier phrase with a voiceless segment before a target stop: e.g. speak

in English, dites [dit] ‘say’ in French, or les ‘read’ in Swedish.
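The sign convention behind this definition can be illustrated with a minimal Python sketch. The function names and the time stamps are hypothetical and serve only as an illustration; they are not measurements from this study.

```python
def vot_ms(release_s, voicing_onset_s):
    """Return VOT in ms: voicing onset time minus release time.

    Negative values mean that voicing starts before the release (prevoicing);
    positive values mean that voicing starts after the release (voicing lag).
    """
    return (voicing_onset_s - release_s) * 1000.0


def is_prevoiced(vot):
    """True if vocal fold vibration begins before the stop release."""
    return vot < 0


# Hypothetical time stamps (in seconds) marked on a waveform.
lead = vot_ms(release_s=0.500, voicing_onset_s=0.413)  # voicing leads the release
lag = vot_ms(release_s=0.500, voicing_onset_s=0.524)   # voicing follows the release
```

Under this convention an utterance-initial prevoiced stop yields a negative value, whereas a short-lag voiceless stop yields a small positive value.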

This solution cannot be used in Russian because a voiceless stop before a voiced

stop will assimilate and be produced as voiced, thus resulting in a voiced cluster with

voicing that continues from a previous segment. Kulikov (2010) reports that the majority

of obstruent clusters (85% in the slow condition and 98% in the fast condition) are fully

assimilated across a word boundary in connected speech. Thus, measuring word-initial

stops with negative VOT in fast speech is not possible. For word-initial stops in the two

connected speech conditions (fast and slow), this study measured voicing during closure

(between the offset of the preceding vowel and the release of the target stop) rather than

VOT. In the list condition (when target stops were utterance-initial), however, it was

possible to measure negative VOT in initial stops and use this in order to establish the

parameters of the voicing contrast.

To investigate whether voice assimilation occurs in stop clusters, acoustic

measurements of the first stop in a cluster (C1) and the second stop in a cluster (C2) were

performed. Closure duration and duration of voicing of the target stops were measured.

Voicing ratios (henceforth, VR) were then calculated as a ratio of duration of voicing to

closure duration.
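The voicing ratio described above can be sketched in Python as follows. The function name and the example durations are hypothetical; only the definition (duration of voicing divided by closure duration) comes from the text.

```python
def voicing_ratio(voicing_ms, closure_ms):
    """Voicing ratio (VR): duration of voicing divided by closure duration."""
    if closure_ms <= 0:
        raise ValueError("closure duration must be positive")
    return voicing_ms / closure_ms


# Hypothetical durations (ms), not taken from the recordings.
fully_voiced = voicing_ratio(80.0, 80.0)    # voicing spans the whole closure
partly_voiced = voicing_ratio(32.0, 50.0)   # voicing stops before the release
```

A VR of 1 corresponds to a closure that is voiced throughout, and a VR of 0 to a fully voiceless closure.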

Both the waveform and the spectrogram were used to set the stop boundaries

(Figure 1). The beginning of the stop closure was marked at the end of the second

formant structure, which typically coincides with a significant drop in amplitude of vocal

fold vibration (Jessen 1998). The end of the closure was marked at the beginning of the

release burst.

F0 and F1-F5 frequencies10 were taken 10 ms after the release of initial and

intervocalic stops, as well as 10 ms before a closure onset for intervocalic stops, C1 stops

in clusters, and final stops (see Figure 1).

The frequency of the first five formants was measured in two stages using the

procedure from McMurray and Jongman (2011). First, frequencies were automatically

extracted for all files using the Burg and SL algorithms with two different parameter sets

(one selected for men and one for women). Next, I examined plots of both formant tracks

10 Only results for F1 are discussed in the dissertation. The results for F4 and F5 were consistent

with the analysis of F1 and hence are not reported here. F2 and F3 were significantly lower in voiced

bilabial stops, and higher in voiced coronal and velar stops as compared to voiceless stops at the same

places of articulation, suggesting that voiced coronal and velar stops had a slightly different place of

articulation. These findings are consistent with Bolla (1981), who reports that [d] and [g] in Russian are

articulated with jaw retraction. Evidence from other languages suggests this is a cross-linguistic tendency.

Van Alphen and Smits (2004) observed a similar difference in place of articulation between [t] and [d] in

Dutch. E. Kurniawan (personal communication) pointed out to me that [t] in Sundanese is dental but [d] is

articulated closer to the alveolar area. Further tests showed that F2 and F3 had relatively weak effects of

underlying voicing, hence they are not discussed here.

superimposed on the spectrogram to visually determine if either of the automatically

coded tracks was correct. If not, formant frequency values were entered by hand from the

spectrogram.

Figure 1. Examples of important acoustic measurements of voiced and voiceless stops (tokens (a) darom ‘for free’ and (b) motor ‘engine’, Speaker 2 (male), slow rate).

Previous studies of effects of speaking rate have shown that there is no difference

in VOT duration in the list reading and the slow rate condition in connected speech

(Kessinger and Blumstein 1997, Beckman et al 2011). The same tendency for voicing

duration in all positions except for intervocalic stops was found in this study as well.

When no difference between list and slow conditions was found, the data for the list

condition were dropped and only the results for the slow and fast conditions are reported.

The results for all three speaking rate conditions are reported for word-medial stops,

where all three rate conditions were different.

3.3. Results I: Word-initial stops

A series of analyses were performed on acoustic measures of initial stops. First, I

examined the list condition in order to assess participants’ production of the voicing

contrast. Next, the effects of speaking rate and sonorant type were examined by looking

at the acoustic cues for voicing in word-initial (but utterance-medial) stops in slow and

fast speaking rates in connected speech. Finally, the most important acoustic correlates of

the voicing contrast were established. The sections below present the results of statistical

analyses. Only important main effects and interactions are discussed.

3.3.1. VOT in utterance-initial stops

First, VOT in initial stops was measured for all tokens in the list condition

(N=336). Table B1 (Appendix B) summarizes the measurements. The statistical analysis

tested the effects of underlying voicing (voiceless, voiced) and the following sonorant

(vowel, consonant) on VOT duration using a repeated measures ANOVA. Place of

articulation (bilabial, coronal, velar) was used as an additional within-subject factor, and

Gender (male, female) as a between-subject factor. Results are summarized in Figure 2.

This ANOVA test yielded a significant effect of underlying voicing

(F(1,12)=2014.4, p<0.0001). Voiced stops were produced with prevoicing; voiceless

stops were produced with short lag VOT. An effect of following sonorant was also

observed (F(1,12)=27.8, p<0.001), and it interacted with voicing (F(1,12)=20.6,

p<0.001). This was due to the fact that for underlying voiced stops prevoicing was

significantly shorter before a sonorant consonant than before a vowel, though both were

clearly prevoiced (Vowel: M=–95.9 ms, SD=18; Consonant: M=–76 ms, SD=17; t(13)=5.15,

p<0.001). For voiceless stops, however, these were not different (Vowel: M=23 ms,

SD=10; Consonant: M=24 ms, SD=10; t(13)=1.08, p=0.30).

Figure 2. Effect of following sonorant (vowel/consonant) on VOT of (a) initial voiced and (b) voiceless stops, broken down by place of articulation. Henceforth, a * indicates a significant difference between adjacent columns.

The expected effect of place of articulation was also found (F(2,24)=33.0,

p<0.001), and it did not interact with other factors (p>0.1). Prevoicing in

bilabial (M= –88 ms, SD=17) and coronal stops (M=–94 ms, SD=14) was longer than

in velar stops (M=–76 ms, SD=22). Positive VOT was shorter in bilabial (M=16 ms,

SD=3) and coronal stops (M=19 ms, SD=5), and longer in velar stops (M=34 ms, SD=5).

No main effect of gender on VOT duration was obtained (F(1,12)=1.84, p=0.22),

but gender interacted with voice (F(1,12)=7.79, p<0.05). Men produced slightly longer

prevoicing in underlying voiced stops than did women (Male: M= –91 ms, SD=16;

Female: M= –81 ms, SD=17; F(1,12)=4.53, p=0.055). No gender differences were

observed in VOTs of underlying voiceless stops (F(1,12)=3.35, p=0.092).

While the main effect of voicing on VOT was highly significant, it does not

address whether there was any overlap in the distributions. This is important to evaluate

whether this cue is an unambiguous marker of underlying voicing, or if there is too much

overlap for it to be used by itself. The distribution of VOT values was thus computed in

10 ms bins centered at 0, 10, 20, etc. (Figure 3).
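The binning scheme can be sketched in Python as follows. This is a minimal illustration, assuming that each 10 ms bin covers the half-open interval from 5 ms below its center to 5 ms above it; the function names and the sample values are hypothetical.

```python
import math
from collections import Counter

BIN_WIDTH_MS = 10


def bin_center(vot_ms):
    """Map a VOT value to the center of its 10 ms bin (..., -10, 0, 10, ...).

    Assumption: a bin centered at c covers the interval [c - 5, c + 5).
    """
    return int(math.floor(vot_ms / BIN_WIDTH_MS + 0.5)) * BIN_WIDTH_MS


def vot_histogram(values_ms):
    """Count how many VOT values fall into each bin."""
    return Counter(bin_center(v) for v in values_ms)


# Hypothetical VOT values (ms): two prevoiced tokens and five short-lag tokens.
sample = [-95, -87, 4, 5, 14, 23, 24]
counts = vot_histogram(sample)
```

With such counts, overlap between the categories would show up as bins that contain both voiced and voiceless tokens.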

[Figure 2 plot. Panels: (a) voiced stops, (b) voiceless stops. x-axis: place of articulation (Bilabial, Coronal, Velar); y-axis: VOT (ms); bars: Vowel vs. Consonant.]

Figure 3. VOT distributions of Russian voiced and voiceless stops in the list condition.

VOT distributions showed very little overlap between the two categories. Overall,

98.5% of voiced stops were produced with robust prevoicing (M=–87 ms, SD=25). Only

1.5% of underlying voiced stops (all velars) were pronounced with short-lag VOT. All

voiceless stops were produced as voiceless unaspirated with a short lag VOT (M=24 ms,

SD=10).

The results suggest that the speakers produced the laryngeal contrast in initial stops as expected. Voiced stops were predominantly prevoiced, and voiceless stops had short-lag VOT. In addition, the analysis revealed that speakers produced shorter prevoicing before sonorant consonants in voiced stops.

3.3.2. Initial stops in connected speech

The goal of the next analysis was to investigate acoustic cues for voicing in word-

initial (but sentence-medial) stops in connected speech. I analyzed 672 tokens produced

in the slow and fast rate conditions. The majority of voiced stops (96.7%) were produced as fully voiced during the closure. For voiced stops, therefore, voicing during closure rather than VOT was measured and analyzed, whereas VOT was measured and analyzed only for voiceless stops. F0 and F1 were measured for both voiced and voiceless categories.

[Figure 3 plot. x-axis: VOT bin (ms), from -160 to 60; y-axis: # of tokens; series: Voiceless, Voiced.]

3.3.2.1. Rate and word duration

The first set of tests examined whether the manipulation of speaking rate had its

intended effect. The entire word length11 was used as a proxy for speaking rate; and

shorter word duration was expected with faster reading. A repeated measures ANOVA

was used to evaluate (within-subjects) effects of speaking rate conditions (slow, fast), and

stop voicing type (voiceless, voiced) on word duration. The results are summarized in

Figure 4.

Figure 4. Effects of speaking rate and voicing on word duration.

A highly significant main effect of speaking rate was found (F(1,13)=135.6,

p<0.0001). Speakers pronounced words in the fast condition at an average of 283 ms

(SD=25); in the slow condition, words were over 100 ms longer, with an

average duration of 407 ms (SD=42). A significant main effect of Voicing type was also

found (F(1,13)=114.4, p<0.001): words with initial voiced stops had on average shorter

11 Duration of words with both voiced and voiceless initial stops was measured from the release

point. Another approach was, of course, to include prevoicing into word duration for cases with initial

voiced stops. Almost identical results were obtained using both approaches: duration of target words was

30% shorter in fast speech than in slow speech.

[Figure 4 plot. x-axis: Speaking Rate (Slow, Fast); y-axis: Duration (ms); bars: Voiced vs. Voiceless.]

duration (M=328 ms, SD=30) than words with voiceless stops (M=362 ms, SD=35), but

this did not interact with speaking rate (F(1,13)=1.0, p=0.332). This effect is intriguing as it

raises the possibility that listeners could be using word duration as a secondary cue for

voicing (cf. Toscano and McMurray 2012, for perceptual studies suggesting this).

Phonetic studies of English (Allen and Miller 1999) also support this, though they

intriguingly show the opposite relationship with voicing (shorter duration for words

beginning with aspirated stops).

3.3.2.2. Acoustic cues for voicing in connected speech

The next analysis examined the word-initial (but sentence-medial) stops to

determine the range of cues that may support voicing discrimination in Russian. In

particular, I examined VOT (for voiceless sounds), closure voicing, f0 and F1

(summaries of these measurements can be found in Tables B2-4 in Appendix B). Each of

these acoustic cues was tested for effects of underlying voicing (voiced, voiceless),

speaking rate (slow, fast), sonorant type (vowel, consonant), and place of articulation

(bilabial, coronal, velar) using separate repeated measures ANOVAs for each cue. The

results of all of these ANOVAs are summarized in Table 1, and the most important

findings are described below.

For positive VOT in voiceless stops, no effect of speaking rate or sonorant type

was found. Only place of articulation affected VOT. On average, velars had longer VOT

(M=31.5 ms, SD=4.3) than bilabials (M=15.2 ms, SD=3.1) or coronals (M=19 ms, SD=5),

all significant differences.

Voicing during closure in voiced stops showed a strong main effect of speaking

rate. Voiced stops were pronounced with shorter voicing in fast speech (M=–65.9 ms,

SD=12) than in slow speech (M=–85.8 ms, SD=16). Sonorant type also had a significant

effect on voicing. Voiced stops had longer voicing when they occurred before a vowel

(M=81.6 ms, SD=14) than before a sonorant consonant (M=71.2 ms, SD=14). An

interaction with speaking rate revealed that the lengthening of voicing in slow speech relative to fast speech was greater before vowels (25.1 ms) than before sonorant consonants (16.5 ms), although both differences were significant (Vowel: t(13)=8.02, p<0.001; Consonant: t(13)=10.4, p<0.001). Finally, place of articulation also affected duration of voicing: bilabial stops

had longer voicing (M=85.9 ms, SD=12) than coronals (M=75.6 ms, SD=15) or velars

(M=68.8 ms, SD=15), all significant differences (Coronal vs. bilabial: t(13)=–3.64,

p<0.01; Coronal vs. velar: t(13)=2.52, p<0.05).

Table 1. Summary of ANOVAs examining effects of underlying voicing (2 levels), speaking rate (2 levels), sonorant type (2 levels), and place of articulation (3 levels) on acoustic cues in word-initial stops.

                                      Cues
Effects (df)                          VOT                 Voicing during closure   f0        F1
                                      (voiceless stops)   (voiced stops)
Underlying voice (1,13)               –                   –                        17.4**    51.7***
Rate (1,13)                           2.30                72.27***                 10.3**    <1
Sonorant (1,13)                       2.96                42.66***                 <1        52.6***
Rate × Sonorant (1,13)                3.50                6.03*                    <1        2.26
Underlying voice × Sonorant (1,13)    –                   –                        <1        6.18*
Place (2,26)                          196.42***           18.52***                 3.19      7.01**
Underlying voice × Rate (1,13)        –                   –                        <1        <1
Underlying voice × Place (2,26)       –                   –                        1.01      18.52***
Place × Sonorant (2,26)               1.58                4.47*                    <1        3.45*
Place × Rate (2,26)                   1.89                <1                       <1        <1

Note: F values are shown (significant values are given in bold). Effects and interactions that are less important for the goals of this study are shown in the lower part of the table under the double line.

* p < 0.05, ** p < 0.01, *** p < 0.001.

Fundamental frequency after the release of the initial stop showed an effect of

underlying voicing. On average, f0 was 8 Hz higher after voiceless stops than after

voiced stops. Speaking rate significantly affected f0, with higher f0 in the fast condition

(M=189 Hz, SD=66) than in the slow condition (M=178 Hz, SD=62). This did not interact

with voicing. Sonorant type and place of articulation did not affect f0.

Finally, F1 was affected by underlying voicing: it was significantly lower after

voiced stops (M=476 Hz, SD=74) than after voiceless stops (M=532 Hz, SD=100),

revealing the tendency to lower the larynx to facilitate production of voiced stops. In

addition, place and sonorant type affected F1, revealing differences due to the positions

of articulators in the oral cavity. No effect of rate was found for F1: the shape of the

vocal tract did not change as a function of speaking rate.

3.3.2.3. Distributions of VOT and voicing during closure in slow and fast speech

Again, the distributions of voicing and VOT in initial (but sentence medial) stops

were analyzed to determine whether these distributions change in response to speaking

rate. A histogram constructed with 10 ms bins centered at 0, 10, 20, etc. showed that 325

(96.7%) word-initial voiced stops in sentence-medial position were produced as fully

voiced. Eleven voiced stops with broken voicing were produced by speakers 1, 4, 7, and 12

(all female). Two such tokens (1.2%) were produced in slow speech, and nine tokens (6.3%) in

fast speech. These tokens were voiced for 64% of their closure duration. All 336

voiceless tokens were produced with a voiceless closure and positive short-lag VOT.

The results of the analysis are consistent with the ANOVAs and show that

speaking rate affected only voicing in voiced stops (Figure 5a), but not VOT in voiceless

stops (Figure 5b). With the slower speaking rate, speakers produced voicing in voiced

stops up to 20.4 ms longer than at the fast rate. When a stop was pronounced before a vowel,

the difference was even greater: 25.2 ms. In slow speech, the range for voiced stops was

125 ms (Min.= –164 ms, Max.= –39 ms). When speakers switched to fast speech, all

voiced stops were produced with shorter voicing. The range of voicing distributions

shrank to 78 ms (Min.= –109 ms, Max.= –31 ms). The mode also changed from 95 ms

(slow speech) to 75 ms (fast speech), suggesting a shift in the entire distribution.

Figure 5. Changes in (a) voicing for word-initial (intervocalic) voiced (round markers) and in (b) VOTs for voiceless (square markers) stops in slow (light markers) and fast (dark markers) speaking rate conditions.

The distribution of VOTs of voiceless stops, in contrast, did not change as a

function of speaking rate. The range was 38 ms (Min.=7 ms; Max.=46 ms) in slow

speech, and it was 40 ms in fast speech, remaining almost within the same boundaries

(Min.=5 ms, Max.=46 ms). The mode was 14 ms in both speaking rate conditions.

3.3.3. Word-initial stops in connected speech: Examining variability in cues

The results of the prior analyses show that voiced and voiceless initial stops vary

along several acoustic parameters: presence/absence of the vocal fold vibration

(prevoicing or VOT), f0, and formant frequencies of the following vowel at the release

point. All cues12 yielded significant effects of underlying voicing. These ANOVAs,

12 VOT was not tested for the effect of underlying voicing because it was measured only in

voiceless stops.

[Figure 5 plot. Panels: (a) Voicing, (b) VOT. x-axis: Duration bin (ms); y-axis: Frequency (%); series: Slow rate, Fast rate.]

however, only show that these cues differ with respect to voicing and leave two questions unanswered. The first question concerns the relative contributions of the different sources of variance (e.g., place, voicing, speaker) to each cue. In a sense, this asks what else affects a given cue and how large that effect is relative to the effect of voicing; in the extreme, it is a way of asking whether any cue rises to the level of invariance. This was investigated with a series of hierarchical regression analyses. The second question concerns how important each cue is, relative to the others, for indicating the voicing contrast. This was addressed using a computational formalism from Toscano and McMurray (2010).

The first analysis examined the association between a cue and several factors

including both underlying voicing and contextual factors. A series of hierarchical

regression analyses were performed in which a single cue was the dependent variable,

and sets of dummy codes for each affecting factor were independent variables. In each

regression, dummy codes for speakers (13), rate (1), sonorant type (1), and place of articulation (2) were added to the model on separate steps, and the total contribution of a factor (e.g., speaker) was evaluated using R² change. The code for underlying voicing was

added to the model last, to determine how much variance voicing accounts for over and

above all other contextual factors. All individual cues were used as described above with

one exception: since VOT can only be measured in voiceless stops, for the purposes of

this analysis VOT was collapsed with voicing during closure into a new cue, “Voice

before/after release”. This cue indicates whether vibration of the vocal folds occurred

before the release or started after it13. Table 2 summarizes the results of the regression

analyses.
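The stepwise logic of such an analysis can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the procedure actually used for Table 2: the helper names (`solve`, `r_squared`, `r2_changes`) and the toy data are hypothetical, the fit is ordinary least squares via the normal equations, and the dummy codes are assumed not to be collinear.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def r_squared(predictors, y):
    """R^2 of an ordinary least-squares fit with an intercept."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def r2_changes(steps, y):
    """Add each block of dummy codes in order; return the R^2 change per step."""
    predictors = [[] for _ in y]
    previous, changes = 0.0, {}
    for name, block in steps:
        predictors = [old + list(new) for old, new in zip(predictors, block)]
        current = r_squared(predictors, y)
        changes[name] = current - previous
        previous = current
    return changes


# Toy data (hypothetical): 4 tokens, rate and voicing each dummy-coded 0/1.
rate = [[0], [0], [1], [1]]
voicing = [[0], [1], [0], [1]]
cue = [0.0, -100.0, 10.0, -90.0]  # constructed as 10*rate - 100*voicing
changes = r2_changes([("rate", rate), ("voicing", voicing)], cue)
```

Because voicing is entered last, its R² change reflects only the variance that the contextual factors entered earlier leave unexplained.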

13 This cue is functionally equivalent to VOT in utterance-initial stops but calling it “voice

before/after release” helps avoid a terminological issue with the definition of VOT. Recall that voicing in

voiced intervocalic stops cannot be technically called “negative VOT” as voicing in this case continues

from a previous segment. Nevertheless, it is still “prevoicing” with respect to the release point.

Table 2. Summary of regression analyses examining effects of speaker (14 levels), rate (2 levels), sonorant type (2 levels), place of articulation (3 levels), and underlying voice (2 levels) in word-initial stops.

Contextual factor

Cue    Speaker    Rate    Sonorant    Place    Underlying voice

df    13,658    1,657    1,656    2,654    1,653

Voice before/after release    0.011*    0.020*    0.874***

f0_post    0.947***    0.009***    0.004***

F1_post    0.235***    0.215***    0.017***    0.091***

Note: R² change values are shown. Missing values were not significant (p>0.05). R² values for invariant cues for voicing are given in bold.

* p < 0.01, *** p < 0.001.

All cues were affected by underlying voicing (which is to be expected given the

prior ANOVAs). The effect sizes (after Cohen and Cohen 1983) varied from small (R²<0.05) to large (R²>0.15), but they were highly significant. Next, the question of whether a certain cue for voicing is context-dependent (i.e. correlates with speakers, rate, or sonorant type) or invariant was addressed. A criterion used in McMurray and Jongman (2011) was adopted: a cue is invariant if it has a large effect of voicing and small effects of context. Only Voice before/after release met the definition of an invariant cue: it correlated highly with underlying voicing (R²=0.874) and had minimal or no relationship with the other factors. F0 had only a small relationship to underlying voicing (R²=0.004) and was largely affected by context (R²=0.956). F1 was moderately related to underlying voicing (R²=0.091) and was largely affected by context (R²=0.465).

Thus, presence or absence of vocal fold vibration during closure has the strongest

association with the voicing contrast. It does not largely depend on speakers, rate,

environment, or place of articulation of initial stops in connected speech.
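The invariance criterion can be expressed as a short Python check. The thresholds (0.05 for a small effect, 0.15 for a large one) and the R² change values are those quoted in the text and in Table 2. Summing the contextual effects before comparing them with the threshold is an assumption made for this sketch, as is the function name.

```python
SMALL_R2 = 0.05  # upper bound of a "small" effect, as quoted in the text
LARGE_R2 = 0.15  # lower bound of a "large" effect, as quoted in the text


def is_invariant(context_r2, voicing_r2):
    """Large effect of voicing and a small combined effect of context.

    Assumption: the contextual R^2 changes are summed before thresholding.
    """
    return voicing_r2 > LARGE_R2 and sum(context_r2) < SMALL_R2


# R^2 change values from Table 2: (significant contextual effects, voicing).
cues = {
    "Voice before/after release": ([0.011, 0.020], 0.874),
    "f0_post": ([0.947, 0.009], 0.004),
    "F1_post": ([0.235, 0.215, 0.017], 0.091),
}
invariant = [name for name, (ctx, voi) in cues.items() if is_invariant(ctx, voi)]
```

On these values only the first cue passes both conditions, which matches the conclusion stated above.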

The second analysis asked about the relative value of these cues for

discriminating voiced and voiceless tokens. Weights for each cue were calculated using

the approach developed in Toscano and McMurray (2010). A single weight indicates how

useful a cue is for discriminating two categories. It determines reliability of a cue as a

function of the distance between the two categories and variance within these categories.

This relationship is captured in a formula:

(21) W=( )

where μ1 and μ2 are the means of each category (e.g. voiced and voiceless) and σ1 and σ2

are their standard deviations. For two overlapping distributions, the greater weight

indicates that a cue more reliably discriminates the two categories. This measure ignores

context effects, and assumes that listeners compute reliability in a cue without the benefit

of any potential normalization, but serves as a good first approximation of how useful a

cue is likely to be. The weights for each cue are summarized in Table 3.

Table 3. Summary of important acoustic cues as predictors of voicing in word-initial stops, pooled across all rates and contexts.

Cue                                Voiceless      Voiced         Weight
                                   Mean (SD)      Mean (SD)
Voice before/after release (ms)    22.5 (9)       -82.7 (30)     .377
f0_post (Hz)                       185 (63)       176 (60)       .002
F1_post (Hz)                       532 (100)      476 (74)       .008

Note: Weight for each cue estimates reliability to predict the voicing category (voiced vs. voiceless). The best predictors are shown in bold.

Not surprisingly, the best predictor for speakers’ production of a stop as voiced or

voiceless was voice before/after release (W=.377). This is consistent with the results of

the previous test, which showed that this cue was invariant and had the strongest

association with underlying voicing. The weights of the other cues were substantially smaller, and their combined weight (Wcom=.010) is much smaller than that of vocal fold vibration. The low reliability of these cues is explained by either a small mean difference (e.g. f0) or great contextual variability (e.g. F1).

The results suggest that the most salient acoustic parameter which ensures the

phonological contrast in initial stops is voicing during closure. Although it is easier for

speakers to cease the vibration of vocal folds or to sustain it for only a part of a closure,

they did not do this and instead maintained voicing during the entire closure in 96.7% of

tokens. Presence of voicing during closure is strongly associated with underlying voicing

and is a very reliable predictor of voicing. It is the only parameter that changed as a

function of the speaking rate manipulation. Significant lengthening in voicing duration in

slower speech was found only in voiced stops; temporal cues in voiceless stops were not

affected by speaking rate. Speakers apparently actively targeted the voicing gesture and

maintained it in all speaking rate conditions.

3.4. Results II: Word-medial stops

The next series of analyses was performed on acoustic measures of voiced and

voiceless stops in word-medial intervocalic position to determine the range of cues that

may support voicing discrimination in Russian. Using the procedure described in section

3.3, the effects of speaking rate and sonorant type on word-medial stops were examined

by looking at VOT (in underlying voiceless stops), closure duration, duration of voicing

during closure, voicing ratio, duration of a preceding vowel, and phonation (f0 and F1) of

preceding (henceforth, f0_pre and F1_pre) and following (henceforth, f0_post and

F1_post) vowels in the list, slow, and fast conditions. 1007 tokens were selected for the

analysis. Preliminary analysis showed that duration of voicing was different in all

speaking rate conditions; thus, results for all three conditions are reported here. No

difference in duration of voicing between men and women was observed (F<1).

Table 4. Summary of ANOVAs examining effects of underlying voicing (2 levels), rate (3 levels), sonorant type (2 levels), and place of articulation (3 levels) for word-medial stops.

Effect (df)               VOT     Voicing    Closure    Vowel      f0_pre   F1_pre   f0_post  F1_post
                                  duration   duration   duration
Voice (1,13)              -       697***     245***     85***      3.0      178***   23**     27***
Rate (2,26)               1.8     41***      83***      74***      <1       2.2      3.5      <1
Voice × Rate (2,26)       -       62***      60***      7.7**      <1       <1       <1       1.7
Sonorant (1,13)           19**    143***     179***     237***     2.9      3.1      <1       24***
Voice × Sonorant (1,13)   -       48***      5.4*       <1         <1       21***    1.1      <1
Rate × Sonorant (2,26)    3.0     22***      54***      56***      1.3      <1       1.6      7.8*
Place (2,26)              72***   41***      39***      23***      3.1      93***    3.5      9.3**
Voice × Place (2,26)      -       7.5**      3.0        1.5        3.3      19***    2.1      <1
Place × Sonorant (2,26)   37***   4.6*       7.3**      4.6*       3.8*     86***    1.7      2.0
Place × Rate (4,52)       <1      2.8        9.6**      <1         4.9*     2.6      6.2**    2.0

Note: F values are shown; significant values are given in bold. Effects and interactions that are less important for the goals of the study are shown in the lower part of the table under the double line.

* p < 0.05; ** p < 0.01; *** p < 0.001.

The results of the acoustic measurements are summarized in Tables B5-11 in Appendix B. Each of the cues14 was tested using separate repeated measures ANOVAs with underlying voicing (voiced, voiceless), speaking rate (list, slow, fast), sonorant type (vowel, consonant), and place of articulation (bilabial, coronal, velar) as factors. The results of these ANOVAs are summarized in Table 4.

14 VOT was not tested for the effect of underlying voicing as this measurement was taken only in voiceless stops.

In addition to the effects of underlying voicing, speaking rate, and sonorant type,

which are described in the following sections, place of articulation affected all cues

except f0. Bilabial stops had longer closure (M=81 ms, SD=11) than coronal stops (M=68

ms, SD=13) and velar stops (M=66 ms, SD=9), while coronals and velars were not

significantly different from each other (t(13)=0.9, p=0.353). The same tendency was

observed in duration of voicing in voiced stops, and it inversely affected duration of a

preceding vowel. Place also affected VOT in voiceless stops: velar stops had significantly

longer VOT (M=34 ms, SD=6) than bilabial (M=21 ms, SD=5) and coronal stops (M=21

ms, SD=5).

3.4.1. Effect of underlying voicing

A main effect of underlying voicing was found for all cues except f0_pre. The duration of voiced and voiceless stops, as well as the duration and phonation of surrounding vowels, was determined by the underlying voicing category. Voiceless stops had closure duration averaging 89 ms (SD=28), which was 19 ms longer than for voiced stops (M=70 ms, SD=23). Voicing during closure averaged 69 ms (SD=11) in voiced stops and 13 ms (SD=7) in voiceless stops. Vowels were longer before voiced stops (M=106 ms, SD=13) and shorter before voiceless stops (M=94 ms, SD=11).

F0 was lower after voiced stops (M=182 Hz, SD=69) and higher after voiceless

stops (M=190 Hz, SD=74). F0 before voiced stops was also lower than before voiceless

stops (M=183 Hz, SD=67), but the difference (3 Hz) did not reach significance.

The results for the F1 frequencies were similar to those obtained for initial stops.

F1 was significantly lower when the vowel occurred before (M=538 Hz, SD=102) and

after (M=410 Hz, SD=49) voiced stops than before (M=559 Hz, SD=115) and after

(M=436 Hz, SD=66) voiceless stops, suggesting that lowering of the larynx had occurred.

3.4.2. Effect of speaking rate

3.4.2.1. Closure duration

Both voiced and voiceless stops were affected by rate, as shown in Figure 6(a).

Speakers produced longer stop closure in the list condition (M=94 ms, SD=28) and slow

condition (M=84 ms, SD=24), and shorter closure in the fast condition (M=60 ms,

SD=17); all differences were significant. The voicing × rate interaction revealed that the

difference between voiced and voiceless stops was significantly greater in the list (24 ms)

and slow (21 ms) conditions than in the fast condition (11 ms).

Figure 6. Effects of speaking rate on (a) closure duration and (b) voicing duration of voiceless and voiced word-medial stops.

3.4.2.2. Voicing during closure

Figure 6(b) summarizes the results of the analysis for voicing duration. A

significant voicing × rate interaction was also obtained for voicing during closure. This

was due to the fact that differences in voicing during closure in different rate conditions

were found only in voiced stops (List: M=80 ms, SD=16; Slow: M=72 ms, SD=13; Fast:

M=53 ms, SD=10; F(2,26)=54.1, p<0.001). Voicing in voiceless stops did not change

significantly (F<1) as a function of speaking rate and averaged 13 ms (SD=9).

3.4.2.3. VOT

The results of the analysis for VOT are shown in Figure 7(a). For VOT in

voiceless stops, no effect of speaking rate was obtained. VOT averaged 26 ms (SD=10) in

the list and slow conditions and 25 ms (SD=9) in the fast condition.

Figure 7. Effect of speaking rate on (a) VOT of voiceless word-medial stops and (b) duration of a preceding vowel.

3.4.2.4. Preceding vowel

A voicing × rate interaction revealed (see Figure 7b) that vowel length was significantly different in all rate conditions, averaging 122 ms (SD=41) in the list condition, 104 ms (SD=28) in the slow condition, and 75 ms (SD=16) in the fast condition. The difference in duration before voiced and voiceless stops was greater in the list condition (16 ms) and smaller in the slow (11 ms) and fast (9 ms) conditions; all three differences were significant (t(13)=7.16, p<0.001; t(13)=7.89, p<0.001; t(13)=8.45, p<0.001).
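
The t statistics reported above have 13 degrees of freedom, which is what a paired comparison across 14 speakers would yield. The sketch below illustrates such a computation. Pairing by speaker is an assumption about how the tests were run, and the durations are hypothetical values, not data from the study.

```python
import math
import statistics


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for two matched samples."""
    if len(first) != len(second):
        raise ValueError("samples must be the same length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD uses n - 1).
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1


# Hypothetical per-speaker mean vowel durations (ms) for 14 speakers.
before_voiced = [100 + i for i in range(14)]
# Made-up differences of 2 ms and 4 ms, alternating across speakers.
before_voiceless = [v - (2 if i % 2 == 0 else 4) for i, v in enumerate(before_voiced)]

t_value, df = paired_t(before_voiced, before_voiceless)
print(f"t({df}) = {t_value:.2f}")
```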

3.4.2.5. Phonation of surrounding vowels

An effect of rate was not obtained for f0 and F1 frequencies either before or after

a target stop, suggesting phonation of vowels did not change significantly across

speaking rates.

3.4.3. Distributions of voicing during closure and VOT

Next, changes in distributions of voicing during closure and VOT in word-medial

stops were examined. Recall that for word-initial (but sentence-medial) stops in

intervocalic position such changes were observed only in voiced stops. Thus, it was

important to evaluate whether similar changes in duration of voicing occur in word-medial voiced stops as well. To this end, two sets of distributions of voicing and VOT values

were computed in 10 ms bins centered at 0, 10, 20, etc. for slow and fast conditions, as

shown in Figure 8.

Figure 8. Distributions of durations of voicing for voiceless (square markers) and voiced (round markers) stops, and VOTs for voiceless stops (triangle markers) in slow (light markers) and fast (dark markers) speaking rate conditions.
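
The binning used for these distributions can be sketched as follows: bins are 10 ms wide and centered at 0, 10, 20, etc., so their edges fall at -5, 5, 15, and so on. The function name, the upper limit of 130 ms, and the example durations are illustrative assumptions, not values taken from the study.

```python
import numpy as np


def binned_percentages(durations_ms, bin_width=10, max_center=130):
    """Percentage of tokens per bin, with bins centered at 0, 10, 20, ... ms."""
    centers = np.arange(0, max_center + bin_width, bin_width)
    # Edges sit half a bin width on either side of each center: -5, 5, 15, ...
    edges = np.append(centers - bin_width / 2, centers[-1] + bin_width / 2)
    counts, _ = np.histogram(durations_ms, bins=edges)
    if counts.sum() == 0:
        raise ValueError("no durations fall inside the binned range")
    percentages = 100 * counts / counts.sum()
    return centers, percentages


# Hypothetical voicing durations (ms), not data from the study.
example = [0, 3, 12, 14, 58, 61, 64, 71, 128]
centers, pct = binned_percentages(example)
# The mode is reported as the center of the most frequent bin.
mode_center = centers[np.argmax(pct)]
print(mode_center)
```

Reporting the mode as the center of the fullest bin is one simple convention; the text does not state how the modes were obtained.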

[Figure 8 plot: duration bin (ms) on the horizontal axis, frequency (%) on the vertical axis.]

For word-medial intervocalic voiced stops, distributions of voicing during closure

were spread along the time continuum to a much greater extent than distribution of

voicing during closure in voiceless stops. This is expected, since voiced stops tended to

be completely voiced during closure, with 92.5% of all tokens produced as fully voiced,

as shown in Table B9. This number did not significantly change in the list, slow and fast

conditions: 92%, 93% and 92% respectively. Voiceless stops were voiceless during

closure (see Table B10), with a short voicing tail into closure and a positive short lag

VOT.

For voiced stops, the range in slow speech was 115 ms (Min.=13 ms, Max.=128

ms). When speakers switched to fast speech, all voiced stops were produced with shorter

voicing. The range was 93 ms (Min.=0 ms, Max.=93 ms). The mode also changed from

69 ms (slow speech) to 53 ms (fast speech). Therefore, the entire distribution shifted as

the speaking rate changed from slow to fast.

No such change occurred in the distribution of voicing tails in voiceless stops.

The range of voicing in slow speech was considerably shorter: 34 ms (Min.=0 ms,

Max.=34 ms). In fast speech, the range showed slight spreading due to several outliers:

42 ms (Min.=0 ms, Max.=42 ms). The mode was 0 ms in both rate conditions.

No significant change in the distribution was observed for VOTs of voiceless

stops, either. The range for the entire distribution was 41 ms (Min.=9 ms, Max.=51 ms)

in slow speech and 46 ms (Min.=9 ms, Max.=55 ms) in fast speech. The mode was 22 ms

in slow and 14 ms in fast speech. These results were similar to those found for word-

initial stops, suggesting that changes in voicing during closure and no change in VOT as

a function of speaking rate are consistent in word-initial and word-medial positions in

connected speech.

3.4.4. Effect of sonorant type

A main effect of sonorant type was found for all cues except f0. Closure duration

was shorter before [r] (M=66 ms, SD=19) than before a vowel (M=93 ms, SD=28). These

differences in duration inversely affected duration of a preceding vowel, which was

longer before a stop followed by [r] (M=109 ms, SD=40) than before a prevocalic stop

(M=83 ms, SD=20) (see Figure 9).

Figure 9. Effects of sonorant type on (a) closure duration and (b) duration of a preceding vowel for voiceless and voiced word-medial stops.

For voicing during closure, sonorant type × voicing (F(1,13)=62.5, p<0.001) and sonorant type × voicing × rate interactions (F(2,26)=50.68, p<0.001) were obtained. This

was due to the fact that voicing was significantly longer before a vowel than before a

consonant only in voiced stops (Figure 10a). Duration of voicing changed as a function of

speaking rate only when voiced stops occurred before a vowel (Figure 10b). No effect of

sonorant type on voicing during closure was obtained for voiceless stops (F<1).

The effect of sonorant type on formant frequencies simply indicates that F1

differed before a vowel and [r] due to the different position of articulators. For f0, which

reflects the configuration of the glottis rather than the shape of the vocal tract, no effect

of sonorant type was observed.

Figure 10. (a) Sonorant × voicing interaction and (b) sonorant × rate interaction for duration of voicing in word-medial voiced stops.

3.4.5. Word-medial stops: Examining variability in cues

The results of the prior analyses show that voicing in word-medial stops varies

along several acoustic parameters: presence/absence of the vocal fold vibration during

closure, closure duration, duration of a preceding vowel, and phonation (f0, F1) of

preceding and following vowels. All cues except f0 before closure onset yielded

significant effects of underlying voicing. These ANOVAs, however, only show that these

cues differ with respect to voicing. A final test was performed to determine how important each cue was (relative to the others) for indicating the voicing contrast.

Using the procedure applied earlier to cues for initial stops, the first step was to examine the association between each cue and several factors, including both underlying voicing and contextual factors. A series of hierarchical regression analyses was performed, in which a single cue was the dependent variable and sets of dummy codes for each factor were independent variables. In each regression, sets of dummy codes for speakers (13), rate (2), sonorant type (1), and place of articulation (2) were added to the model on separate steps, and the total contribution of a factor (e.g., rate) was evaluated using R² change. The code for underlying voicing was added to the model last to determine how much variance voicing accounts for over and above all other contextual

factors. Because VOT was measured only in voiceless stops, another cue – “Voice

before/after release” – was evaluated instead. This cue includes voicing during closure

that was not broken before the release in voiced stops and positive VOT in voiceless

stops. It indicates whether vibration of the vocal folds occurred before the release or

started after it. Table 5 summarizes the results of the regression analyses.
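
The stepwise logic of these regressions can be sketched as follows. Each factor is dummy-coded, factors are added to an ordinary least-squares model one step at a time, and the R² change at each step is recorded, with voicing entered last. This is a minimal illustration on hypothetical balanced data; the function names and effect sizes are assumptions, and it is not the software or the data used in the study.

```python
import numpy as np


def dummy_codes(labels):
    """Dummy-code a factor with k levels into k-1 indicator columns."""
    levels = sorted(set(labels))
    return np.array([[1.0 if lab == lev else 0.0 for lev in levels[1:]]
                     for lab in labels])


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)


def r2_change_by_step(y, factors):
    """Add each factor's dummy codes in order; return the R^2 change per step."""
    y = np.asarray(y, dtype=float)
    X = np.empty((len(y), 0))
    previous, changes = 0.0, {}
    for name, labels in factors:
        X = np.column_stack([X, dummy_codes(labels)])
        current = r_squared(X, y)
        changes[name] = current - previous
        previous = current
    return changes


def toy_example():
    """Hypothetical balanced data: 2 speakers x 2 rates x 2 voicing categories."""
    speaker, rate, voice, cue = [], [], [], []
    for s in (0, 1):
        for r in (0, 1):
            for v in (0, 1):
                speaker.append(s)
                rate.append(r)
                voice.append(v)
                cue.append(1.0 * s + 2.0 * r + 20.0 * v)  # made-up effect sizes
    # Voicing is entered last, as in the procedure described in the text.
    return r2_change_by_step(
        cue, [("speaker", speaker), ("rate", rate), ("voice", voice)])


if __name__ == "__main__":
    for factor, change in toy_example().items():
        print(f"{factor}: R2 change = {change:.3f}")
```

In this toy design the cue depends almost entirely on voicing, so the last step accounts for nearly all of the variance, the pattern that the text describes for an invariant cue.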

Table 5. Summary of regression analyses examining effects of speaker (14 levels), rate (3 levels), sonorant type (2 levels), place (3 levels), and underlying voice (2 levels) in word-medial stops.

                  Contextual factor
Cue    Speaker    Rate     Sonorant    Place    Underlying voice
df     13,993     2,991    1,990       2,988    1,987

Voice before/after release 0.011* 0.012* 0.016*** 0.875***

Voicing during closure 0.023*** 0.034*** 0.022*** 0.715***

Closure duration 0.064* 0.271*** 0.235*** 0.061*** 0.116***

Vowel duration 0.120* 0.324*** 0.263*** 0.016*** 0.013***

f0_pre 0.833* 0.018*** 0.001***

F1_pre 0.236* 0.291*** 0.038***

f0_post 0.871* 0.028*** 0.004***

F1_post 0.201* 0.017*** 0.042*** 0.033***

Note: R² change values are shown. Missing values were not significant (p<0.05). R² values for invariant cues for voicing are given in bold.

* p < 0.01; *** p < 0.001.

All cues were affected by underlying voicing (which is to be expected given the prior ANOVAs). The effect sizes (after Cohen and Cohen 1983) varied from small (R²<0.05) to large (R²>0.15), but were all highly significant. Next, the question of whether a certain cue for voicing is context-dependent (correlates with speakers, rate, or sonorant type) or invariant was addressed. Only two cues, Voice before/after release (R²=0.875) and Voicing during closure (R²=0.715), met the definition of an invariant cue: they correlated highly with voicing and had minimal or no relationship with the other factors. Closure duration correlated only moderately with voicing (R²=0.116) and was largely affected by context (R²=0.57). Other cues had a small relationship to underlying voicing (R²<0.05) and were largely affected by context. Thus, presence or absence of vocal fold vibration during closure has the strongest association with the voicing contrast.

Finally, the cues were examined for their relative value for discriminating voiced

and voiceless tokens. The same procedure as in section 3.3.3 was used. Weights for each

cue were calculated using the formula W=( )

,

where μ1 and μ2 are the means for each category (e.g. voiced and voiceless) and σ1 and σ2

are their standard deviations. A summary is given in Table 6.

Table 6. Summary of important acoustic cues as predictors of voicing in word-medial intervocalic stops, pooled across all rates and contexts.

Cue                                Voiceless      Voiced         Weight
                                   Mean (SD)      Mean (SD)
Voice before/after release (ms)    25.3 (10)      -69.7 (23)     .419
Voicing during closure (ms)        13.2 (9)       68.6 (23)      .270
Closure duration (ms)              88.6 (28)      69.9 (23)      .029
Vowel duration (ms)                97.7 (38)      106 (35)       .007
f0_pre (Hz)                        181 (68)       177 (65)       .001
F1_pre (Hz)                        580 (115)      538 (102)      .004
f0_post (Hz)                       186 (69)       175 (64)       .002
F1_post (Hz)                       434 (78)       410 (62)       .005

Note: Weight for each cue estimates reliability to predict the voicing category (voiced vs. voiceless). The best predictors are shown in bold.

Not surprisingly, the best predictors for speakers’ intention to produce a stop as voiced or voiceless were Voice before/after release (W=.419) and Voicing during closure (W=.270). This is consistent with the results of the previous test, which showed that these cues were invariant and had the strongest association with underlying voicing. Other cues were not found to be reliable predictors of voicing. Their combined weight (Wcom=.048) is smaller than the weight of vocal fold vibration. The low reliability of these cues is explained by either a small mean difference (e.g. vowel duration, f0, F1) or great contextual variability (e.g. closure duration, f0, F1).

The results for word-medial stops were consistent with those for word-initial

stops. They suggest that the most salient acoustic parameter which ensures the

phonological contrast in intervocalic stops is voicing during closure. Speakers maintained

voicing during the entire closure in 92.5 % of the tokens although it is generally easier to

cease the vibration of vocal folds or to sustain it for only a part of a closure. Voicing

during closure is strongly associated with underlying voicing and is a very reliable

predictor of voicing. It is the only parameter that changed with the speaking rate

manipulation. Significant lengthening in voicing duration in slower speech was found

only in voiced stops; temporal cues in voiceless stops were not affected by speaking rate.

Speakers actively targeted the voicing gesture and maintained it in all speaking rate

conditions.

3.5. Results III: Clusters and voice assimilation

The next set of tests was performed to evaluate acoustic cues for voicing in stop

clusters. Both the first (C1) and the second (C2) tokens in a cluster were examined. As

C1 stops are expected to assimilate in stop clusters, the goal of the analysis was to

examine whether their acoustic properties were determined by their underlying voicing or

by voicing in the following C2 stops. 336 tokens were analyzed. The preliminary test

showed that voicing in the list and slow conditions was not significantly different

(t(13)=2.03, p=0.063). Therefore, the results for the list condition were dropped in favor

of the analysis of voice assimilation in connected speech. No difference in duration of

voicing between men and women was found (F<1).

3.5.1. Voicing in C2

The first analysis examined voicing in the second consonants (C2) in a cluster.

The results are summarized in Table 7. For the purposes of the study it was important to

establish whether voicing in C2s was a constant parameter or varied unpredictably. The

second situation would considerably complicate the analysis of voicing in C1 stops.

Table 7. Means and standard deviations (in brackets) for acoustic properties of C2 stops.

                    Voiced /b/      Voiceless /k/
VOT                 -               31.6 ms (8)
Closure duration    79.4 ms (19)    61.6 ms (17)
Voicing duration    77.8 ms (21)    0 (0)
VR                  98.2% (11)      0% (0)

Because only /b/ and /k/ were used in the stimuli, direct comparison of temporal

cues in these stops did not make sense. Such comparison would find the differences (e.g.

in closure duration) that exist due to place of articulation rather than to underlying

voicing. Prior tests showed that bilabial stops in Russian were inherently longer than

velar stops and this pattern was also found in C2 stops. Instead, the proportion of voicing

duration to closure duration (voicing ratio; henceforth, VR) of the two categories of stops

was compared across two speaking rate conditions.
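
The voicing ratio can be written as a small helper, shown here as a sketch. It assumes that VR is computed per token and then averaged, which the text does not state explicitly; the function names and token values are hypothetical.

```python
def voicing_ratio(voicing_ms, closure_ms):
    """Voicing ratio (VR): voicing duration as a percentage of closure duration."""
    if closure_ms <= 0:
        raise ValueError("closure duration must be positive")
    return 100.0 * voicing_ms / closure_ms


def mean_voicing_ratio(tokens):
    """Average per-token VR over (voicing_ms, closure_ms) pairs."""
    ratios = [voicing_ratio(voicing, closure) for voicing, closure in tokens]
    return sum(ratios) / len(ratios)


# Hypothetical tokens: two fully voiced stops and one partially voiced stop.
tokens = [(80, 80), (60, 60), (38, 40)]
print(round(mean_voicing_ratio(tokens), 1))
```

Because VR is a ratio, a mean of per-token ratios need not equal the ratio of the mean voicing duration to the mean closure duration, which is one reason a reported mean VR can differ slightly from a value derived from pooled means.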

C2 stops were produced consistently with their underlying voicing properties. All underlying voiceless stops /k/ were pronounced as fully voiceless with positive VOT averaging 32 ms. Underlying voiced stops /b/ were produced as voiced: 90% of the tokens were voiced during the entire closure, and the rest were evenly spread along the continuum with VRs within the range between 0% and 95%.

The voicing ratios of C2 stops did not change significantly due to the speaking

rate manipulation (F(1,13)=1.46, p=0.249). Underlying voiceless stops remained fully

voiceless (VR=0%) in both rate conditions; underlying voiced stops were voiced for

most of their closure and had VR of 98.1% in slow speech and 94.3% in fast speech.

3.5.2. C1 stops

The next analysis examined acoustic cues in C1 stops. Observation of waveforms

and spectrograms showed that the majority of C1 stops were released15. Only one token

(/p/ in the /pt/ cluster, Sp07) was unreleased. This token was excluded from the analyses

of closure and voicing duration. The acoustic measurements are summarized in Tables

B12-15 (Appendix B).

Table 8. Summary of ANOVAs examining effects of underlying voicing (2 levels), C2 voicing (2 levels), and speaking rate (2 levels) on acoustic cues in C1 stops in stop clusters.

Effect (df)                           Voicing during   Closure     Vowel       f0_pre   F1_pre
                                      closure          duration    duration
Underlying voice (1,13)               <1               <1          22.1**      <1       18.8*
C2 voice (1,13)                       150.5**          103.7**     93.8**      1.1      51.9**
Rate (1,13)                           23.7**           97.7**      70.3**      2.9      1.1
Underlying voice × C2 voice (1,13)    <1               1.6         <1          <1       1.1
C2 voice × Rate (1,13)                28.5**           3.8         <1          2.9      2.4
Underlying voice × Rate (1,13)        <1               2.0         1.5         <1       1.5

Note: F values are shown; significant values are given in bold.

* p < 0.01, ** p < 0.001.

15 Heterorganic stop clusters were used to reduce the number of unreleased stops (Zsiga 2000).

The results were validated using separate repeated measures ANOVAs for each

cue (closure duration, voicing during closure, vowel duration, f0 and F1 on a preceding vowel)

with underlying voicing (voiced, voiceless), C2 voicing (voiced, voiceless), and speaking

rate (slow, fast) as factors. A summary of these ANOVAs is shown in Table 8, and the

most important findings are described below.

3.5.2.1. Effects of underlying and C2 voicing

For closure duration and duration of voicing, no effect of underlying voicing was

found (Figure 11). Underlying voiced and voiceless C1 stops did not differ significantly,

with closure averaging 45 ms (SD=9) and voicing averaging 23 ms (SD=7).

Figure 11. Effects of C2 voicing and speaking rate on (a) closure duration and (b) voicing duration in underlying voiced and voiceless C1 stops in a cluster.

The effect of C2 voice on these cues, in contrast, was highly significant. C1 stops

were shorter before voiced C2 stops (M=36 ms, SD=7) and longer before voiceless C2

stops (M=57 ms, SD=10). Voicing was longer before voiced stops (M=35 ms, SD=7)

than before voiceless stops (M=11 ms, SD=6). Thus, duration of closure and voicing in

C1 stops was fully determined by laryngeal properties of C2 stops. No interaction was

found, suggesting that assimilation occurred before both voiced and voiceless C2 stops

and affected underlying voiced and voiceless stops identically.

For duration of a preceding vowel, a main effect of underlying voicing was obtained. Vowels were longer before underlying voiced stops (M=71 ms, SD=10) than before underlying voiceless stops (M=66 ms, SD=11). A main effect of C2 voicing was also obtained, and it was greater than the effect of underlying voicing (U.voice: ηp²=0.630, C2 voice: ηp²=0.878). Vowels were longer before clusters with voiced stops (M=77 ms, SD=10) and shorter before clusters with voiceless stops (M=59 ms, SD=11), as shown in Figure 12a.
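
The partial eta squared values above can be recovered from the F ratios in Table 8 with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the two vowel-duration effects; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values for vowel duration from Table 8, each with (1, 13) degrees of freedom.
underlying_voice = partial_eta_squared(22.1, 1, 13)
c2_voice = partial_eta_squared(93.8, 1, 13)
print(round(underlying_voice, 3), round(c2_voice, 3))
```

Both results round to the effect sizes reported in the text (0.630 and 0.878).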

Figure 12. Effects of underlying voicing and C2 voicing on (a) duration of a preceding vowel and (b) F1 frequency on underlying voiced and voiceless C1 stops in a cluster.

The test also revealed main effects of underlying voicing and C2 voicing on F1

frequency, and the effect of C2 voicing was stronger (U.voice: ηp²=0.592; C2 voice:

ηp²=0.800). F1 was significantly lower before underlying voiced stops (M=484 Hz,

SD=83) than before underlying voiceless stops (M=514 Hz, SD=99), and the same

tendency was observed for assimilated stops before voiced and voiceless C2 (see Figure

12b). Thus, differences between underlying voiced and voiceless stops in clusters are

revealed by differences in the durations of the preceding vowels even though the stops

themselves are completely assimilated in voicing to the following stops.

The test on f0 frequency did not reveal main effects of underlying voicing or C2

voicing. F0 averaged 177 Hz (SD=59) before C1 stops in all clusters.

3.5.2.2. Effect of speaking rate

Speaking rate, as expected, affected duration of temporal cues. Stop closure was

longer in the slow rate condition (M=53 ms, SD=10) and shorter in the fast rate condition

(M=39 ms, SD=8).

Duration of voicing was longer in the slow rate condition than in the fast rate

condition. Speaking rate interacted with C2 voicing. This was due to the fact that voicing

duration changed as a function of speaking rate only before voiced stops (t(13)=6.79,

p<0.001). Before voiceless stops, changes in voicing duration were not significant (t<1).

Speaking rate affected vowel duration. Vowel length averaged 77 ms in the slow

rate condition (SD=10) and 60 ms (SD=11) in the fast rate condition. No interaction was

found (F<1).

No effect of rate was obtained for f0 and F1 frequencies, suggesting the

configuration of the glottis did not change across different speaking rate conditions.

3.5.2.3. Distribution of voicing in C1 stops

Figure 13 shows the distributions of voicing in the first stops in a cluster before

voiced and voiceless C2. Voicing distributions for underlying voiced and voiceless C1

stops show a complete overlap in both types of cluster. C1 stops surfaced as voiced

before voiced C2 stops, with mean VR of 99.7%; C1 stops surfaced as voiceless before

voiceless C2 stops, with mean VR of 20.7%.

Figure 13. Distributions of voicing during closure of C1 underlying voiceless and voiced stops before voiceless (left column) and voiced (right column) C2 stops.

Next, changes in the distributions of voicing in the slow and fast conditions were

examined. These are shown in Figure 14. The voicing in C1 stops in clusters reflects the

pattern observed earlier in word-initial and word-medial intervocalic stops. Distribution

of a voicing tail in voiceless stops does not change as a function of speaking rate. The

range is 0-28 ms in the slow rate condition, and 0-33 ms in the fast condition. The mode

was 0 ms in both rate conditions.

Figure 14. Distributions of durations of voicing during closure for voiceless (square markers) and voiced (round markers) C1 stops in a cluster in slow (light markers) and fast (dark markers) speaking rate conditions.

[Figure 13 plots: voicing bin (ms) on the horizontal axis, number of tokens on the vertical axis, with one panel for voiceless C2 and one for voiced C2. Figure 14 plot: duration bin (ms) on the horizontal axis, frequency (%) on the vertical axis.]

In contrast, the voicing in voiced C1 stops changed as a function of speaking rate.

The range shifted from 22-66 ms in slow speech to 17-48 ms in fast speech. The mode

also changed from 34 ms in slow speech to 27 ms in fast speech. Thus, the entire

distribution of voicing ‘shrank’ and moved closer to zero.
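The range, mode, and binned counts reported for these distributions can be computed as in the following sketch. The duration lists are invented so that their range and mode match the values reported above for voiced C1 stops; the individual tokens and the 10-ms bin width are assumptions made for illustration.

```python
from collections import Counter


def summarize(durations_ms, bin_width=10):
    """Return the range, mode, and binned counts of voicing durations."""
    rounded = [round(d) for d in durations_ms]
    mode = Counter(rounded).most_common(1)[0][0]
    # Assign each token to the lower edge of its bin (e.g. 34 ms -> 30).
    bins = Counter((d // bin_width) * bin_width for d in rounded)
    return {
        "range": (min(rounded), max(rounded)),
        "mode": mode,
        "bins": dict(sorted(bins.items())),
    }


# Hypothetical voiced C1 tokens (ms), not the study's data.
slow = [22, 30, 34, 34, 41, 52, 66]
fast = [17, 24, 27, 27, 33, 40, 48]

for label, data in (("slow", slow), ("fast", fast)):
    print(label, summarize(data))
```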

3.5.3. Stops in clusters: Examining variability in cues

The results of the analyses show that voicing in stops in clusters varies along

several acoustic parameters: presence/absence of the vocal fold vibration (voicing during

closure), closure duration, duration and phonation (F1) of a preceding vowel.

The results clearly showed that C1 stops in obstruent clusters are generally

assimilated in voicing to the rightmost obstruent. Russian (coronal) underlying voiced

and voiceless stops were pronounced as voiced when they occurred before a voiced C2

stop, with voicing during the entire closure and lower F1 frequencies. Underlying voiced

and voiceless stops were pronounced as voiceless when they occurred before voiceless

C2 stops, with voiceless closure and higher F1 frequencies.

To determine which cues had the strongest association with voicing in stops in

clusters, a series of analyses was performed using the procedure described in sections

3.3.3 and 3.4.3. The first question was how much effect each cue has

relative to the effect of voicing. This was investigated with a series of hierarchical

regression analyses. The regression analysis was performed twice. The first pass

evaluated the association of each cue with underlying voicing. Each cue was the

dependent variable, and sets of dummy codes for each affecting factor were the independent

variables. In each regression, dummy codes for speakers (13), rate (1), and C2 voice (1) were

added to the model on separate steps, and the total contribution of a factor (e.g., speaker)

was evaluated using R² change. Then, the analysis was repeated in order to establish the

association between the cues and phonetic voicing in assimilated tokens. A summary of

these regression analyses is shown in Table 9.
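The stepwise logic of such a hierarchical regression can be sketched as follows. This is an illustration under stated assumptions, not the analysis script used in the study: the data are simulated, the effect sizes in the generating model are invented, and the small least-squares routine is written out only to keep the example self-contained. Only the order of the steps (speaker, rate, C2 voice, then voicing) follows the description above.

```python
import random


def r_squared(columns, y):
    """R^2 of an OLS fit of y on an intercept plus the given predictor columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[best] = A[best], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def dummies(labels):
    """Dummy-code a factor, dropping the first level as the reference."""
    levels = sorted(set(labels))[1:]
    return [[1.0 if lab == lev else 0.0 for lab in labels] for lev in levels]


# Simulated design: 14 speakers x 2 rates x 2 C2 voice x 2 underlying voice x 2 reps.
rng = random.Random(0)
rows = [(s, rate, c2, uv) for s in range(14) for rate in (0, 1)
        for c2 in (0, 1) for uv in (0, 1) for _ in range(2)]
# Invented generating model: C2 voice dominates, underlying voice barely matters.
y = [20 + 0.5 * s - 4 * rate + 25 * c2 + 1.0 * uv + rng.gauss(0, 5)
     for s, rate, c2, uv in rows]

steps = [
    ("speaker", dummies([r[0] for r in rows])),
    ("rate", dummies([r[1] for r in rows])),
    ("C2 voice", dummies([r[2] for r in rows])),
    ("underlying voice", dummies([r[3] for r in rows])),
]

columns, previous, r2_change = [], 0.0, {}
for name, cols in steps:
    columns += cols                      # add this factor's dummy codes
    current = r_squared(columns, y)
    r2_change[name] = current - previous  # contribution of this step
    previous = current

for name, change in r2_change.items():
    print(f"{name}: R2 change = {change:.3f}")
```

Because each factor is entered after the preceding ones, its R² change reflects only the variance it explains beyond the factors already in the model.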


Table 9. Summary of regression analyses examining effects of speaker (14 levels), rate (2 levels), C2 voice (2 levels), underlying voice (2 levels), and surface voice (2 levels) in stops in clusters.

Columns: Cue; Speaker, Rate, C2 voice (contextual factors); Underlying voice; Surface voice
df: 13,210; 1,209; 1,208; 1,207; 1,207
Voicing during closure: 0.040**, 0.667***, 0.021***
Closure duration: 0.107**, 0.195***, 0.417***
Vowel duration: 0.187***, 0.252***, 0.273***, 0.022***
f0_pre: 0.963***, 0.004***
F1_pre: 0.589***, 0.181***, 0.024***

Note: R² change values are shown; missing values were not significant (p<0.05).
** p < 0.01; *** p < 0.001.

After contextual factors were added to the model, two cues had a weak but

significant effect of underlying voicing: vowel duration (R²=0.022) and F1 frequency

(R²=0.024). When this analysis was applied against surface voicing, it revealed a weak

but significant effect of voicing only for voicing during closure (R²=0.021).

None of the cues was invariant: the cues (except voicing during closure) had

strong context effects and, more importantly, effects of C2 voicing, which is not

surprising in a case of voice assimilation. The strongest effect of C2 voicing was obtained

for voicing during closure (R²=0.662).

The second analysis asked about the relative value of these cues for

discriminating underlying voicing in C1 stops. Weights for each cue were calculated

using the formula

$W = \dfrac{|\mu_1 - \mu_2|}{\sigma_1 \sigma_2}$,

where $\mu_1$ and $\mu_2$ are the means of each category (e.g. voiced and voiceless) and $\sigma_1$ and $\sigma_2$

are their standard deviations. A summary is given in Table 10.


Table 10. Summary of important acoustic cues as predictors of underlying voicing in C1 stops in stop clusters, pooled across all rates and contexts.

Cue                           Voiceless, Mean (SD)   Voiced, Mean (SD)   Weight
Voicing during closure (ms)   22.4 (16)              29.9 (19)           .024
Closure duration (ms)         53.1 (18)              50.4 (16)           .009
Vowel duration (ms)           69.9 (19)              73.5 (17)           .011
f0_pre (Hz)                   183 (63)               182 (60)            0.0
F1_pre (Hz)                   523 (103)              493 (93)            .003

Note: Weight for each cue estimates reliability to predict the voicing category (voiced vs. voiceless).

All cues were very weak predictors for underlying voicing of C1 stops in clusters.

Their combined weight was very small (Wcom=.047), suggesting a great overlap of the

underlying voiced and voiceless categories in all cues.

Finally, the same analysis asked about the relative value of these cues for

discriminating phonetically voiced or voiceless assimilated C1 stops in clusters. A

summary of weights is given in Table 11.

Table 11. Summary of important acoustic cues as predictors of surface voicing in C1 stops in stop clusters, pooled across all rates and contexts.

Cue                           Voiceless, Mean (SD)   Voiced, Mean (SD)   Weight
Voicing during closure (ms)   10.5 (6)               37.6 (12)           .384
Closure duration (ms)         58.4 (15)              38.1 (13)           .110
Vowel duration (ms)           60.0 (14)              74.4 (17)           .060
f0_pre (Hz)                   184 (60)               181 (58)            .001
F1_pre (Hz)                   545 (100)              472 (84)            .009

Note: Weight for each cue estimates reliability to predict the voicing category (voiced vs. voiceless). The best predictors are shown in bold.


Voicing during closure was found to be the best predictor for surface voicing

(W=0.384). The combined weight of all other cues was smaller (Wcom=0.180) than the

weight of voicing.
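The weight comparison can be reproduced approximately from the summary statistics in Table 11. The sketch below assumes that a cue weight is the absolute difference between the category means divided by the product of the two standard deviations, a reading that is consistent with the tabulated weights, and that the combined weight is the sum of the individual weights. Because the published SDs are rounded, the computed values only approximate the tabulated ones (e.g. about .376 rather than .384 for voicing during closure). The function and variable names are illustrative.

```python
def cue_weight(mean_1, sd_1, mean_2, sd_2):
    """Assumed weight: |mu_1 - mu_2| / (sigma_1 * sigma_2)."""
    return abs(mean_1 - mean_2) / (sd_1 * sd_2)


# (voiceless mean, voiceless SD, voiced mean, voiced SD), copied from Table 11.
table_11 = {
    "voicing during closure (ms)": (10.5, 6, 37.6, 12),
    "closure duration (ms)": (58.4, 15, 38.1, 13),
    "vowel duration (ms)": (60.0, 14, 74.4, 17),
    "f0_pre (Hz)": (184, 60, 181, 58),
    "F1_pre (Hz)": (545, 100, 472, 84),
}

weights = {cue: cue_weight(*stats) for cue, stats in table_11.items()}
best = max(weights, key=weights.get)
# Combined weight of every cue other than the best one.
combined_others = sum(w for cue, w in weights.items() if cue != best)

for cue, w in weights.items():
    print(f"{cue}: W = {w:.3f}")
print(f"best cue: {best}; combined weight of other cues = {combined_others:.3f}")
```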

These results suggest that voicing during closure is the most salient cue for

voicing in stop clusters. It is the only cue that is significantly associated with surface

voicing and is the best predictor for realization of stops as voiced or voiceless. In

addition, prior tests showed that voicing during closure was the only parameter that

changed due to the speaking rate manipulation. Significant lengthening in voicing

duration in slower speech was found only in assimilated voiced C1 stops. Voicing in

voiceless stops was not affected by speaking rate manipulation.

3.6. Results IV: Devoicing in final stops

The last set of analyses examined acoustic cues in word-final stops. Out of 504

target words with word-final stops, 503 were selected for the analysis. One bilabial stop

was discarded as it was produced as fully voiced. Careful examination of pronunciation

of this phrase showed that the speaker hesitated unnaturally while pronouncing the word;

therefore, it was likely to be a speech error. The acoustic measurements are summarized

in Tables B16-19 (Appendix B). A preliminary test found no difference in voicing

duration between list and slow conditions (t<1). Thus, the data for the list condition were

dropped in favor of the analysis of final devoicing in connected speech. No difference in

voicing duration between men and women was observed (F(1,12)=2.15, p=0.169).

To assess the effects of underlying voicing (voiced, voiceless), speaking rate

(slow, fast), and place of articulation (bilabial, coronal, velar) on the acoustic properties

of underlying voiced and voiceless word-final stops, repeated measures ANOVAs were

performed on closure duration, duration of voicing during closure, VR, duration of a

preceding vowel, f0, and F1 frequencies. Table 12 summarizes the results of the tests.
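For a single within-subject factor with two levels, the repeated measures F statistic reduces to the square of a paired t statistic computed over per-speaker means, with df = (1, n-1). The sketch below illustrates only this simplest case; the analyses reported here are factorial (voicing × rate × place), and the per-speaker vowel durations used below are invented for illustration.

```python
from statistics import mean, stdev


def rm_anova_two_levels(cond_a, cond_b):
    """F and df for a two-level within-subject factor, from per-speaker means."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    # Paired t statistic on the per-speaker differences; F(1, n-1) = t^2.
    t = mean(diffs) / (stdev(diffs) / n ** 0.5)
    return t * t, (1, n - 1)


# Hypothetical mean vowel durations (ms) for 14 speakers in two rate conditions.
slow_ms = [118, 109, 121, 115, 112, 124, 117, 108, 119, 113, 116, 122, 111, 114]
fast_ms = [75, 70, 78, 72, 71, 80, 74, 69, 77, 73, 72, 79, 70, 74]

f_value, df = rm_anova_two_levels(slow_ms, fast_ms)
print(f"F({df[0]},{df[1]}) = {f_value:.1f}")
```

With 14 speakers, this gives the (1,13) degrees of freedom reported for the two-level factors.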


Table 12. Summary of ANOVAs examining effects of underlying voicing (2 levels), speaking rate (2 levels), and place of articulation (3 levels) on acoustic cues in final stops.

Effects (df)                     Voicing during closure   Closure duration   Vowel duration   f0_pre   F1_pre
Underlying voice (1,13)          2.1                      1.5                13.4**           1.6      15.3**
Rate (1,13)                      4.6^                     43.3***            74.4***          1.0      1.1
Underlying voice × Rate (1,13)   1.8                      <1                 <1               <1       10.0**
Place (2,26)                     10.6***                  18.1***            12.4***          2.7      34.8***
Place × Rate (2,26)              1.2                      1.1                <1               3.4^     1.1

Note: F values are shown; significant values are given in bold.
^ p=0.05; ** p < 0.01; *** p < 0.001.

The effects of underlying voicing and speaking rate are reported in the sections

below. In addition, an effect of place of articulation was obtained for all cues except f0.

Duration of closure was the longest for bilabials (M=78 ms, SD=15) and shorter for

coronals (M=64 ms, SD=15) and velars (M=65 ms, SD=14), with no significant

difference between the latter two (t(13)<1). Bilabial stops had longer voicing duration

(M=16 ms, SD=8) than coronals (M=12 ms, SD=7) and velars (M=12 ms, SD=7). This

is consistent with the pattern observed in intervocalic stops.

Place of articulation also affected vowel duration. Vowels were longer before

coronals (M=99 ms, SD=16) and shorter before bilabials (M=92 ms, SD=17) and velars

(M=90 ms, SD=18).

An effect of place of articulation was obtained for F1 frequency: F1 was lower

before velar (M=406 Hz, SD=65) and bilabial stops (M=414 Hz, SD=72) than before

coronal stops (M=452 Hz, SD=74). No effect of place was obtained for f0.


3.6.1. Effect of underlying voice

For closure duration, a main effect of underlying voicing was not found. Closure

duration of underlying voiced and voiceless stops did not differ significantly (see Figure

15a), averaging 68 ms (SD=16) for underlying voiced and 70 ms (SD=13) for underlying

voiceless stops.

Both underlying voiced and voiceless stops were produced with a short voicing

tail into closure, suggesting devoicing had occurred. Interaction with place revealed that

the underlying voicing contrast was not neutralized in bilabial stops (F(1,13)=9.69,

p<0.001). Voicing was longer in underlying /b/s (17 ms) than in /p/s (14 ms). No

significant difference in voicing duration was found in coronal and velar stops (M=12 ms,

SD=8; F<1) (Figure 15b).

Figure 15. Effects of place of articulation and underlying voicing on (a) closure duration and (b) voicing into closure in final stops.

Duration of a preceding vowel, in contrast, was affected by underlying voicing for

all places of articulation (Figure 16a). Vowels were longer before underlying voiced

stops (M=97 ms, SD=18) and shorter before underlying voiceless stops (M=91 ms,

SD=15).

[Figure 15 panels: (a) Closure and (b) Voicing; x-axis: Bilabial, Coronal, Velar; y-axis: Duration (ms); series: Voiceless, Voiced.]

Figure 16. Effect of speaking rate on (a) closure duration and (b) duration of a preceding vowel of word-final underlying voiced and voiceless stops.

No effect of underlying voice was obtained for f0 frequency. The difference in

fundamental frequency before underlying voiced and voiceless stops was only 2 Hz,

which was not significant.

The same analysis performed on F1 frequencies yielded an effect of underlying

voicing (Figure 16b). F1 was significantly lower before underlying voiced stops (M=414

Hz, SD=74) than before voiceless stops (M=434 Hz, SD=77).

Thus, devoicing was found in final stops as speakers did not distinguish between

duration of closure and voicing in voiced and voiceless stops. However, devoicing did

not result in complete neutralization as speakers preserved the underlying voicing

contrast in duration and F1 frequency of a preceding vowel.

3.6.2. Effect of speaking rate

Speaking rate affected duration of stop closure (Figure 16a): stops were longer in

the slow condition (M=81 ms, SD=16) than in the fast condition (M=57 ms, SD=12).

Duration of a preceding vowel was also affected by speaking rate: vowel length averaged

115 ms in slow rate condition (SD=21) and 73 ms (SD=14) in fast speech (Figure 16b).

The statistical test revealed that speaking rate marginally affected voicing

duration (F(1,13)=4.57, p=0.052). Duration of a voicing tail into closure was slightly

longer in the fast rate condition (M=14 ms, SD=7) and shorter in the slow rate condition

(M=12 ms, SD=6).

[Figure 16 panels: (a) Vowel, y-axis: Duration (ms); (b) F1, y-axis: Frequency (Hz); x-axis: Slow, Fast; series: Voiceless, Voiced.]

No main effect of speaking rate was obtained for F1, but rate interacted with

underlying voicing, revealing that F1 was significantly lower before underlying voiced

stops only in slow speech (Vd.: M=410 Hz, SD=64; Vl.: M=440 Hz, SD=71;

F(1,13)=18.1, p<0.01), but F1 was not different in fast speech (Vd.: M=425 Hz, SD=60;

Vl.: M=427 Hz, SD=62; F(1,13)=2.22, p=0.160).

3.6.3. Distribution of voicing

Examination of distributions of voicing duration revealed that there is an almost

complete overlap between underlying voiced and underlying voiceless categories in both

speaking rate conditions, as shown in Figure 17.

All underlying voiceless stops were produced as voiceless, with a short voicing

tail into closure, averaging 15% in slow speech and 25% in fast speech. Most underlying

voiced stops were also produced as voiceless, with a short voicing tail for 16% of closure

duration in slow speech. In fast speech, some speakers (S3, S11, and S13) failed to

devoice four bilabial stops (3.6% of underlying voiced tokens) and produced them as

voiced during the entire closure. When these tokens were removed from the data, VR of

underlying voiced stops in fast speech was 27%, which was not significantly different

from the ratios observed in underlying voiceless tokens.

In slow speech, the distribution was in the range of 0-35 ms for underlying

voiced and 0-33 ms for underlying voiceless stops. The range increased slightly in fast

speech: between 0 ms and 38 ms for underlying voiced stops and between 0 ms and 39

ms for underlying voiceless stops, which was not significant.


Figure 17. Voicing distributions of Russian underlying voiceless and voiced stops in slow (left column) and fast (right column) speaking rate conditions, broken down by place of articulation.

3.6.4. Final stops: Examining variability in cues

The tests revealed that underlying voiced and voiceless final stops showed

evidence of neutralization of most acoustic cues: absence of vocal fold vibration during

entire closure, closure duration, and fundamental frequency of the preceding vowel

before closure onset. Russian final stops are characterized by absence of vocal fold

vibration throughout the entire closure.

[Figure 17 panels: Slow and Fast for Bilabial /p/, /b/; Coronal /t/, /d/; Velar /k/, /g/; x-axis: Voicing bin (ms); y-axis: # of tokens.]

The neutralization, nevertheless, was not complete. The speakers preserved the

contrast for the acoustic cues on a preceding vowel. Vowel duration was significantly

longer before underlying voiced stops. In addition, formant frequencies of a preceding

vowel were found to be lower when the vowel occurred before an underlying voiced stop.

A series of regression analyses¹⁶ supported these results (see Table 13). Only two

cues, vowel duration and F1, showed small (R²<0.05) but highly significant effects of

underlying voicing. None of the cues was invariant as the combined effect of context and

place was much greater than the effect of underlying voicing.

Table 13. Summary of regression analyses examining effects of speaker (14 levels), rate (2 levels), place (3 levels), and underlying voice (2 levels) on voicing cues in word-final stops.

Columns: Cue; Speaker, Rate, Place (contextual factors); Underlying voice
df: 13,321; 1,320; 1,318; 1,317
Voice during closure: 0.374***, 0.024***, 0.053***
Closure duration: 0.198***, 0.330***, 0.093***
Vowel duration: 0.185***, 0.554***, 0.019***, 0.010***
f0_pre: 0.935***, 0.003***
F1_pre: 0.551***, 0.076***, 0.018***

Note: R² change values are shown; missing values were not significant (p<0.05).
*** p < 0.001.

¹⁶ The procedure was described at length in sections 3.3.3, 3.4.5, and 3.5.3.

None of the cues was found to be a good predictor of underlying voicing, either

(see Table 14).

Table 14. Summary of important acoustic cues as predictors of underlying voicing in word-final stops, pooled across all rates and contexts.

Cue                           Voiceless, Mean (SD)   Voiced, Mean (SD)   Weight
Voicing during closure (ms)   12.2 (8)               13.4 (9)            .015
Closure duration (ms)         79.7 (25)              76.8 (25)           .005
Vowel duration (ms)           109 (39)               115 (41)            .004
f0_pre (Hz)                   188 (63)               188 (60)            0.0
F1_pre (Hz)                   434 (103)              415 (93)            .004

Note: Weight for each cue estimates reliability to predict the underlying voicing category (voiced vs. voiceless).

Although voicing during closure had the biggest weight (W=0.015), which was

greater than the combined weight of other cues (Wcom=0.013), this weight coefficient was

very small in comparison with the weights for voicing during closure obtained for stops

in contrastive, presonorant positions (e.g. Initial: W=0.377, Intervocalic: W=0.419).

These results suggest that a single cue cannot reliably predict underlying voicing in

devoiced final stops.

3.7. Effects of speaking rate and environment on voicing duration: Omnibus analysis

The previous tests revealed that duration of voicing in phonetically voiced stops

in each position (initial, medial, cluster) was affected by speaking rate and the type of the

following segment, while duration of a short voicing tail in voiceless stops remained

relatively stable in all conditions and positions. The omnibus tests supported these

findings.


The first test (repeated measures ANOVA with voicing (voiced, voiceless) and

speaking rate (list, slow, fast)) was performed on duration of voicing in phonetically

voiced and voiceless stops in word-initial and word-medial position and in clusters across

three rate conditions. The coronal stops [t] and [d] (n=840) were used for the analysis.

Figure 18a summarizes the results.

Figure 18. Effects of (a) speaking rate and (b) environment on duration of voicing in voiced (dark bars) and voiceless (light bars) word-internal coronal stops [t, d], pooled across all speakers.

The test yielded a significant interaction (F(1,13)=49.8, p<0.001). Separate

ANOVAs for voiced and voiceless stops found a significant effect of rate in voiced stops

(F(1,13)=69.3, p<0.001) but no effect of rate in voiceless stops (F<1). Speakers produced

longer voicing during closure in slower speech in voiced stops in all positions within a

word.

The second test (repeated measures ANOVA with voicing (voiced, voiceless) and

position (initial, medial, cluster, final)) was performed on duration of voicing in voiced

and voiceless stops in all rate conditions across the four positions within a word. The

coronal stops [t] and [d] (n=1008) were used for the analysis. Figure 18b summarizes the

results. The test also yielded a significant interaction (F(1,13)=195.4, p<0.001). Separate

ANOVAs for voiced and voiceless stops found a strong significant effect of Position in

voiced stops (F(1,13)=259.4, p<0.001) but no effect of Position in voiceless stops

(F(1,13)=2.44, p=0.079). Speakers produced longer voicing during closure in more

sonorous environments (before a vowel or a sonorant consonant) and shorter voicing

before obstruents. Voicing duration was the shortest in final stops where the voicing

contrast is generally neutralized. Duration of a short voicing tail in voiceless stops

remained relatively stable in all positions.

[Figure 18 panels: (a) Rate, x-axis: List, Slow, Fast; (b) Environment, x-axis: Vowel, Sonorant, Obstruent, Final; y-axis: Duration (ms); series: [t], [d].]

3.8. Discussion and conclusions

Some of these results are consistent with some claims about voicing in Russian

(e.g. Barry 1995; Pye, 1986; Ringen and Kulikov, in press). A strong effect of underlying

voicing on acoustic cues was found for prevocalic/presonorant stops, which is interpreted

as preservation of underlying voicing. On the surface, underlying voiced stops were

produced as voiced and underlying voiceless stops were produced as voiceless in the

intervocalic position both word-initially and word-medially. In stop clusters, by contrast,

a strong effect of C2 was obtained, which is interpreted as voice assimilation. Surface

voicing in C1 stops was largely determined by the voicing properties of the following

stop. C1 stops were produced as voiced before voiced C2s and as voiceless before

voiceless C2s.

Some results differed from those of previous studies (e.g. Barry 1988). In

word-final stops, an effect of underlying voicing was found for some acoustic cues (e.g.

duration of a preceding vowel, F1) and was not obtained for others (e.g. duration of

voicing, duration of stop closure, f0), indicating that the contrast is largely neutralized

and final stops are devoiced, but neutralization is not complete. Similarly, an effect of

underlying voicing was found for some cues (duration of a preceding vowel, F1) in C1

stops in clusters, suggesting voice assimilation is not complete either.


The analysis of acoustic measurements for word-internal stops revealed that the

most important cue and the best predictor of the voicing category is presence/absence of

vocal fold vibration during the closure phase. In utterance-initial stops in the list

condition, the cue is realized as a contrast between negative and short lag positive VOT,

with very little overlap (1.5% of tokens) between the two categories. In intervocalic

position, the cue is realized as a difference in voicing during closure. The contrast is

between voiced stops, which were predominantly fully voiced during closure, and

voiceless stops with a short voicing tail into closure and short lag VOT. Spectral cues (f0

and F1) indicate the presence of an articulatory gesture (larynx lowering) that facilitates

vocal fold vibration. Duration of closure and duration of a preceding vowel also help to

distinguish between voiced and voiceless stops. The two cues often show a trading

relation with each other within a syllable: longer vowel duration before a voiced stop

usually correlates with shorter duration of this stop. Nevertheless, vowel duration seems

to be a more persistent cue to voicing. Speakers distinguished vowel duration before

underlying voiced and voiceless stops even in cases of assimilation and devoicing, when

differences in closure duration between voiced and voiceless stops are neutralized.

An important finding in this study is a strong effect of speaking rate on voicing.

Voicing during closure decreased as speech became faster. Importantly, the effect was

found only in (surface-)voiced stops, which are assumed to be specified with the feature

[voice]. Speakers apparently actively manipulated vocal fold vibration for production of

voicing and adjusted duration of voicing in these stops in different speaking rates. They

voiced the entire closure, which, on average, was voiced for 98.8% of its duration and

was unbroken in 95.9% of tokens. Previous studies have shown an effect of speaking rate

on VOT in initial stops. The results of this experiment show that voicing changes as a

function of speaking rate in voiced stops in different positions within a word. The

findings are consistent with claims that speakers change cues that implement the features


of contrast (Beckman et al 2011) and they strongly support previous claims that the

feature of contrast in Russian is [voice].

Voicing during closure in word-internal stops gradually decreased before less

sonorous segments: sonorant consonants and stops. Duration of voicing in voiced stops is

contingent on the category of the following segments and varies along the sonority scale.

It is the longest before vowels, the most sonorous segments, and then gradually decreases

before sonorant consonants, and is the shortest before voiced stops.

This study revealed an effect of a sonorant consonant on voicing. Some previous

studies (Docherty 1992, Van Alphen and Smits 2005) report that voiced stops had shorter

(pre)voicing or shorter voicing during closure when they occurred before a sonorant

consonant. In addition, voiced stops before sonorant consonants exhibited breaking of

continuous voicing during closure more often than in a prevocalic position.¹⁷

Temporal cues in voiceless stops, which are assumed to be phonologically

unspecified for a laryngeal feature, were not affected by environment or speaking rate.

VOT durations remained relatively stable across all speaking rates in line with the pattern

observed in Kessinger and Blumstein 1997, Magloire and Green 1999, and Solé and

Estebas 2000. In addition, no effect of speaking rate was found on durations of a short

voicing tail into closure in voiceless stops, as predicted by the model. Voicing into

closure and VOT in voiceless stops were not affected by environment, either. Durations

of voicing and VOTs did not vary in prevocalic and presonorant positions.

The results show that voicing in voiced intervocalic stops in Russian is different

from voicing in aspirating languages like English or German. As reported in Lisker and

Abramson (1967), Docherty (1992), and Beckman et al (in press), roughly half of the voiced

intervocalic stops in these languages are produced with broken voicing. In line with

Ringen and Kulikov (in press), the pattern observed in Russian in this study is very

different: more than 95% of voiced stops in the intervocalic position were produced with

voicing that continued throughout the entire closure across all speaking rates.

¹⁷ A study by Ringen and Suomi (2012) did not find this effect. It is possible that the significance level was not reached because of the small number of voiced stops before sonorant consonants (17% of all tokens) in the sample.

Another important finding is that there are traces of underlying specification in

assimilated or devoiced stops. Although voicing (i.e. vocal fold vibration) in C1 stops

was completely determined by C2 voicing, nonetheless speakers showed evidence of

anticipatory lowering of the larynx before underlying voiced stops in all clusters. Even

though speakers devoiced C1 stops before voiceless C2s, they produced lower F1 before

underlying voiced stops than before underlying voiceless stops. The same anticipatory

mechanism was observed in voiced clusters. Speakers had a more prominent larynx

lowering gesture (lower F1 values) before underlying voiced stops than before underlying

voiceless stops. Similarly, lower F1 values were found before underlying voiced final

stops, suggesting that no complete neutralization occurred. No neutralization was found

in the duration of a preceding vowel for assimilated and devoiced stops. Speakers had a

strong tendency to produce a significantly longer vowel before underlying voiced stops.

Some of these results are different from results previously reported in the literature.

Recall that Barry (1988) and Dmitrieva et al (2010) did not find significant differences in

vowel duration before underlying voiced and voiceless word-final stops. This

discrepancy in results can be explained by the bigger pool of speakers and the different

method of analysis used in this study. Barry (1988) had 8 speakers and Dmitrieva et al (2010)

had only 4 monolingual speakers of Russian. The graphs presented in Barry 1988 suggest

that at least 6 out of 8 speakers produced longer vowels before underlying voiced word-

final stops but the differences were not significant. In this experiment, 13 out of 14

speakers reached a significance level in production. Therefore, it is possible that there is

variation among speakers in the degree to which they preserve a difference in vowel

duration.


In addition, both Barry and Dmitrieva et al used a t-test, whereas a factorial

repeated measures design was used to analyze results in this experiment. This test

analyzes each factor independently and can account for confounding effects of other

factors (e.g. manner or place of articulation, or within-speaker variation). Thus, the test in

this experiment provided a more detailed comparison of vowel duration before

underlying voiced and underlying voiceless stops. As was shown in some studies (e.g.

McMurray et al 2010, Cole et al 2010), speakers are capable of partialling out

confounding effects of other phonetic variables (e.g. gender, place of articulation, quality

of a neighboring vowel etc.) and recovering underlying properties of a segment.

Second, speakers in the three studies varied in the range of difference in closure

duration between underlying voiced and voiceless final stops. The mean difference of 3

ms in closure duration in this study was not found to be significant, nor was the

difference significant in Barry (1988). Dmitrieva et al (2010), in contrast, report that

speakers produced closure of final stops with a significant mean difference of 16 ms. This

difference might be a result of the differences in stimuli used in these studies. Dmitrieva

et al used short monosyllables that were minimal pairs (e.g. kot ‘cat’, kod ‘code’)

whereas longer, disyllabic words were used in this study. Speakers of Russian have been

shown to distinguish between minimal pairs in perception and recover underlying

differences in short words (Matsui 2011).

To conclude: the results of acoustic measurements support the claim that the

feature of contrast in Russian is [voice]. Voicing during closure was the most salient cue

and the best predictor of the voicing category. Duration of voicing changes as a function

of speaking rate and this change occurs only in voiced stops, which are assumed to be

phonologically specified with [voice]. Phonetic context can affect voicing duration.

Duration of voicing in intervocalic voiced stops is shorter before sonorant consonants

than before vowels. Other cues, such as duration of a preceding vowel and F1, are also


diagnostic of voicing processes (voice assimilation and final devoicing). They have

effects of both underlying voicing and C2 stops, the effects of C2 being stronger.

These results are used as reference points to investigate three cases of Russian

voice assimilation in prepositions across a clitic boundary: in regular CC-clusters, in

obstruent-sonorant-obstruent clusters, and in clusters with a voiced labiodental fricative

/v/. Chapters 4-6 provide the acoustic analyses of voicing and voice assimilation in

prepositions in these environments.


CHAPTER IV

EXPERIMENT 2: VOICE ASSIMILATION IN PREPOSITIONS

4.1. Background

Voice assimilation across a clitic boundary presents a case of ‘on-line’

assimilation. It is, to some extent, different from voice assimilation in word-internal

clusters as speakers cannot have stored underlying representations for all possible

prepositional phrases. Each time a preposition is pronounced before a word that begins

with an obstruent, speakers have to determine the voicing properties of target sounds. The

data on voice assimilation in clusters in prepositions in Russian have not been

systematically gathered. Several sources report that voice assimilation in prepositions is

either not regular or is incomplete.

Burton and Robblee (1999) argue that incomplete assimilation in prepositions is

an effect of the clitic boundary. They tested five native speakers of Russian who

produced obstruent clusters across a clitic boundary in prepositional phrases. Their results

show that voice assimilation in obstruents is incomplete in slow speech in Russian. For

example, voicing duration during closure in most underlying voiced C1 stops in /ds/ clusters was

more than 50% of stop closure duration, averaging 68%. This is not what is usually

observed in word-internal voiceless stops (see Chapter 3 of this study). In addition, many

/t/s in /ts/ clusters had a range of voicing similar to /d/ in this position (58-100% of stop

closure). Voicing for the bigger part of its duration in underlying voiceless C1 stops is not

what is expected in a fully voiceless environment.

Lihtman (1980) claims that there is a trend in Russian to devoice final obstruents

in prepositions similar to word-final devoicing. Kalenčuk and Kasatkina (1999) suggest

that this trend was temporary because devoicing in speakers’ speech was not observed in

the 1990s. These authors, however, relied on impressionistic transcriptions rather than instrumental

measurements.

The goal of this experiment was to compare parameters of voicing in clusters

across a clitic boundary with word-internal clusters and establish the phonetic properties

of assimilated obstruents in prepositions. The experiment was set up to test three

hypotheses about voice assimilation in clitics. The null hypothesis is that stops in clitics

behave in the same way as do word-internal stops. The first alternative hypothesis is that

the prosodic boundary affects voice assimilation and causes incomplete assimilation. This

should be manifested as shorter voicing in an underlying voiceless C1 before a voiced

C2, as well as longer voicing in an underlying voiced C1 before a voiceless C2 in clitics

compared to word-internal stop clusters. The second alternative hypothesis is that stops in

clitics undergo final devoicing. This should be manifested as shorter voicing or even

absence of voicing in underlying voiced C1 stops in all clusters in clitics. The parameters

of interest – closure duration, voicing during closure, ratio of voicing to closure

duration (VR), duration of a preceding vowel, and f0 and F1 frequencies taken at

10 ms before a C1 closure – were tested for effects of underlying voicing (underlying

voiced, underlying voiceless), following segment (sonorant, voiced, voiceless), prosodic

domain (preposition, word), and speaking rate (slow, fast) using repeated measures

ANOVAs. Separate tests were run for each acoustic parameter.
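Because all four factors are within-subject and 14 speakers took part, the degrees of freedom of the F-tests reported in this chapter follow from the number of levels of each factor. The Python sketch below is an illustration only, not part of the original analysis. It assumes uncorrected degrees of freedom (sphericity assumed), and the function name `rm_anova_df` is hypothetical.

```python
import math
from typing import Sequence, Tuple

N_SPEAKERS = 14  # number of participants stated in the text


def rm_anova_df(levels: Sequence[int], n_subjects: int = N_SPEAKERS) -> Tuple[int, int]:
    """Degrees of freedom for a within-subject effect (assumed: no sphericity correction).

    `levels` lists the number of levels of every factor involved in the effect:
    one entry for a main effect, several entries for an interaction.
    """
    effect_df = math.prod(k - 1 for k in levels)
    error_df = effect_df * (n_subjects - 1)
    return effect_df, error_df


# Main effect of speaking rate (slow, fast)
print("rate:", rm_anova_df([2]))
# Main effect of following segment (sonorant, voiced, voiceless)
print("following segment:", rm_anova_df([3]))
# Three-way interaction: domain x following segment x underlying voicing
print("domain x segment x voicing:", rm_anova_df([2, 3, 2]))
```

Under these assumptions a two-level factor yields F(1,13) and the three-level factor yields F(2,26), which is the pattern of the test statistics reported in Section 4.4.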

4.2. Participants and stimuli

Fourteen native speakers of Russian, 7 males and 7 females, participated. They

were the same subjects that participated in Experiment 1. The participants were not aware

of the purpose of the experiment. The stimuli were phrases with prepositions that ended

in a voiceless (ot ‘from’) or voiced (nad ‘over’) stop and were followed by nouns with

initial sonorants [r], [m], with voiceless stops [p], [k], and with voiced stops [b], [g]: e.g.

nad ramoj ‘over the frame’, ot baka ‘from the tank’, nad kartoj ‘over the map’. The full

list of target phrases is given in Appendix A(2). As in the previous experiment,

heterorganic stop clusters were used to reduce the number of unreleased stops (Zsiga

2000).

Three types of clusters appeared at a clitic boundary: 1) clusters where no

assimilation is expected (/tr/, /tm/, /dr/, /dm/), 2) clusters where a voiceless C1 is

expected (/tp/, /tk/, /dp/, /dk/), and 3) clusters where voicing of C1 is expected (/tb/, /tg/,

/db/, /dg/). These tokens were compared with word-internal clusters recorded in the first

experiment, which included intervocalic stop-sonorant clusters (e.g. ka/dr/a ‘frame’

Gen.sg., tea/tr/a ‘theater’ Gen.sg.), stop-stop clusters with a voiced C2 (e.g. goro/db/a

‘fencing’, molo/tb/a ‘threshing’), and stop-stop clusters with a voiceless C2 (e.g. sa/dk/a

‘cage’ Gen.sg., ka/tk/a ‘roller’ Gen.sg.). In addition to twelve target phrases twelve filler

prepositional phrases were added and the list was randomized.

4.3. Procedure and measurements

The speakers were digitally recorded as in Experiment 1. The participants were

asked to read the list of phrases in three conditions: first, as a list (henceforth list rate);

then, within a carrier phrase Skaži _____ ešče raz ‘Say ____ again’ at a comfortable

tempo (henceforth slow rate); and finally, within the same carrier phrase Skaži _____

ešče raz ‘Say ____ again’ at a fast tempo (henceforth fast rate). Subjects were instructed

to pronounce phrases as fast as they could but not at the expense of comprehensibility.

They were permitted to correct themselves if they did not like their reading, in which case

the last reading was selected for analysis. They read the list of phrases three times, but

only the second and third readings were recorded. The total number of target tokens was

1512 (18 target phrases x 3 speech conditions x 2 readings x 14 speakers).

Twenty-eight tokens were discarded due to absence of audible and visible (on a spectrogram)

release; these were produced by speakers 5, 6, 9, 10, 11, 14 in a roughly equal proportion.

Thus, 1484 tokens were analyzed. The segments were manually marked for boundaries in

PRAAT (Boersma & Weenink 2011). The same criteria and measurements were applied

as in Experiment 1.
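The token counts given in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only; the constant names are hypothetical, and the numbers are those stated in the text.

```python
# Design figures as stated in Section 4.3
TARGET_PHRASES = 18
SPEECH_CONDITIONS = 3   # list, slow, fast
READINGS = 2            # second and third readings
SPEAKERS = 14
DISCARDED = 28          # tokens without an audible and visible release

total_tokens = TARGET_PHRASES * SPEECH_CONDITIONS * READINGS * SPEAKERS
analyzed_tokens = total_tokens - DISCARDED
discarded_share = DISCARDED / total_tokens

print(f"recorded: {total_tokens}, analyzed: {analyzed_tokens}, "
      f"discarded: {discarded_share:.1%}")
```

The discarded tokens therefore amount to less than 2% of the recorded material.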

4.4. Results

Preliminary analysis did not reveal significant differences in voicing between the

list and slow conditions (t(13)=1.3, p=0.216); thus, the results for the list condition were

dropped to allow for an analysis of voicing in more natural speaking conditions. The

acoustic cues (voicing duration, closure duration, voicing ratio, vowel duration, f0, and

F1) were examined using separate repeated measures ANOVAs with underlying voice

(voiced, voiceless), prosodic domain (word, preposition), following segment (sonorant,

voiced, voiceless), and speaking rate (slow, fast) as factors. The results for each cue are

summarized in the sections below.
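The p-value of the preliminary comparison between the list and slow conditions can be recovered from the reported statistic, t(13)=1.3. The following sketch is not the procedure used in the study; it assumes a two-tailed test and integrates the Student t density numerically with Simpson's rule, using only the Python standard library. The function names are hypothetical.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t distribution with `df` degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    t = abs(t)
    h = t / steps  # `steps` must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (total * h / 3)  # the density is symmetric around zero
    return 1 - central_area


p_list_vs_slow = two_tailed_p(1.3, 13)
print(f"t(13)=1.3, two-tailed p = {p_list_vs_slow:.3f}")
```

Under the two-tailed assumption, the result should agree closely with the reported p=0.216.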

4.4.1. Closure duration

The measurements of closure duration in C1 stops are presented in Table C1

(Appendix C). A main effect of speaking rate (F(1,13)=112.9, p<0.001) revealed that

stops were on average longer in slow speech (M=59.7 ms, SD=11) and shorter in fast

speech (M=44.8 ms, SD=10) (see Figure 19a).

Figure 19. Effects of (a) speaking rate and (b) underlying voicing on closure duration of underlying /d/ and /t/ in consonantal clusters.

[Figure 19 panels, y-axis: Duration (ms). (a) Rate: slow vs. fast speech in prepositions and words. (b) Underlying voice: voiceless vs. voiced stops before sonorant (Son), voiced (V-d), and voiceless (V-less) segments in prepositions and words.]

A main effect of following segment (F(2,26)=38.78, p<0.001) indicated that stops

were longer before a sonorant consonant (M=60 ms, SD=14), shorter before a voiceless

stop (M=55 ms, SD=10), and the shortest before a voiced stop (M=42 ms, SD=8); all

conditions were different from each other (p<0.05). Due to a significant three-way

domain × following segment × voice interaction (F(2,26)=3.99, p<0.05), separate F-tests

were run for each cluster.

As shown in Figure 19b, the underlying contrast in closure duration between

voiced and voiceless stops was found in both domains only before a sonorant consonant

(Vd: M=52 ms, SD=13; Vl: M=68 ms, SD=14; F(1,13)=71, p<0.001), but not before a

voiced stop (M=42 ms, SD=8; F<1) or a voiceless stop (M=55 ms, SD=10; F<1),

suggesting that assimilation occurred before obstruents but the underlying voicing

contrast was not neutralized in a presonorant position.

Stop closure in prepositions was significantly longer than in words before a

sonorant consonant (Prep.: M=64 ms, SD=15; Word: M=56 ms, SD=12; F(1,13)=13.04,

p<0.01) and before a voiced stop (Prep.: M=48 ms, SD=9; Word: M=35 ms, SD=7;

F(1,13)=50.92, p<0.001); only a marginal difference was found before a voiceless stop

(Prep: M=53 ms, SD=9; Word: M=57 ms, SD=10; F(1,13)=3.55, p=0.082).

4.4.2. Duration of voicing

Mean values of voicing during closure are given in Table C2 (Appendix C).

Observation of distributions of voicing during closure (Figure 20) showed that the

underlying voicing contrast in C1 stops was preserved in a presonorant position. In stop

clusters, the underlying contrast was completely neutralized: distributions of underlying

voiced and underlying voiceless C1 stops overlapped completely before voiced C2 as

well as before voiceless C2. The distributions of voiced stops were more compressed in

fast speech, suggesting manipulation of speaking rate affected duration of voicing in

voiced stops. This pattern was consistent with the pattern that was observed earlier in

word-internal stops (see sections 3.4.3 and 3.5.2.3 in Chapter 3 for comparison).

Figure 20. Distributions of voicing during closure in underlying voiceless and voiced stops in prepositions, broken down by speaking rate (slow, fast) and following segment (sonorant, voiceless C2, voiced C2).

[Figure 20 panels: six histograms of voicing duration for underlying voiceless and voiced stops, one per following segment (sonorant, voiced C2, voiceless C2) and speaking rate (slow, fast). x-axis: Voicing bin (ms), 0 to 100 in 10-ms steps; y-axis: # of Tokens, 0 to 30.]
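The distributions in Figure 20 are token counts per 10-ms voicing bin. A minimal sketch of such binning is given below. It is an illustration under stated assumptions, not the procedure used to draw the figure: each duration is assigned to the bin whose lower edge it reaches, durations of 100 ms or more fall into the last bin, and the sample durations are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def bin_voicing(durations_ms: Iterable[float], width: int = 10, top: int = 100) -> Dict[int, int]:
    """Count tokens per voicing bin; keys are lower bin edges in ms (assumed convention)."""
    counts = Counter(min(int(d // width) * width, top) for d in durations_ms)
    # Report every bin from 0 to `top`, including empty ones, as on the x-axis of Figure 20
    return {edge: counts.get(edge, 0) for edge in range(0, top + width, width)}


# Hypothetical voicing durations (ms), for illustration only
sample = [0, 4, 12, 19, 57, 100, 130]
histogram = bin_voicing(sample)
for edge, n in histogram.items():
    print(f"{edge:>3} ms: {'#' * n}")
```

A compressed distribution, as described for voiced stops in fast speech, would show its counts concentrated in fewer adjacent bins.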

A repeated measures ANOVA with underlying contrast (voiced, voiceless),

prosodic domain (word, preposition), following segment (sonorant, voiced, voiceless),

and speaking rate (slow, fast) as factors confirmed these observations. The results are

summarized in Figure 21.

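Figure 21. Effects of (a) following segment and (b) speaking rate on duration of voicing during closure of underlying voiced and voiceless stops in consonantal clusters.

[Figure 21 panels, y-axis: Duration (ms). (a) Following segment: underlying voiceless vs. voiced stops before sonorant (Son), voiced (V-d), and voiceless (V-less) segments in prepositions and words. (b) Rate: slow vs. fast speech for underlying voiced (V-d) and voiceless (V-less) stops before sonorant, voiced, and voiceless segments.]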

Crucially, a main effect of prosodic domain (henceforth domain) was found

(F(1,13)=54.59, p<0.001), with no interaction with voicing (F(1,13)=3.74, p=0.075) or rate

(F<1). Voicing was, on average, longer in stops in prepositions than in stops in word-

internal clusters (Prep: M=33 ms, SD=9; Wd: M=25 ms, SD=7). An interaction with

following segment (F(2,26)=12.55, p<0.001) revealed that the significant difference in

voicing duration between the two domains was greater (12 ms) before a voiced stop and

smaller before a sonorant (5 ms) and a voiceless stop (6.5 ms).

The test yielded main effects of underlying voicing (F(1,13)=130.61, p<0.001),

following segment (F(2,26)=96.68, p<0.001), and their interaction (F(2,26)=151.8,

p<0.001). The underlying voicing contrast was preserved in both domains (see Figure

21a) before a sonorant (Vd: M=50.6 ms, SD=12; Vl: M=14.2 ms, SD=7), but it was

neutralized before voiced stops (Vd: M=41.1 ms, SD=8; Vl: M=41.3 ms, SD=8) and

before voiceless stops (Vd: M=14.8 ms, SD=6; Vl: M=14.0 ms, SD=7).

A main effect of speaking rate was obtained (F(1,13)=52.62, p<0.001), and it

interacted with voicing (F(1,13)=13.84, p<0.01) and following segment (F(2,26)=21.98,

p<0.001). The interactions revealed (see Figure 21b) that in both domains speaking rate

affected only stops with phonetic voicing: underlying voiced stops before a sonorant

(Slow: M=57 ms, SD=13; Fast: M=45 ms, SD=12; F(1,13)=34.17, p<0.001) and before a

voiced stop (Slow: M=47 ms, SD=8; Fast: M=35 ms, SD=7; F(1,13)=51.49, p<0.001).

No significant difference was found in voiceless presonorant stops (M=14 ms, SD=8;

F(1,13)=1.64, p=0.223) or in stops before voiceless C2s (M=14 ms, SD=6; F<1).
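The size of the rate effect on stops with phonetic voicing can be read off the means reported above. The sketch below simply restates those means and expresses the shortening in absolute and relative terms; it is illustrative only, and the dictionary and function names are hypothetical.

```python
# Mean voicing durations (ms) reported above for stops with phonetic voicing
REPORTED_MEANS = {
    "underlying voiced C1 before a sonorant": {"slow": 57, "fast": 45},
    "C1 before a voiced stop": {"slow": 47, "fast": 35},
}


def rate_shortening(slow_ms: float, fast_ms: float):
    """Return the shortening from slow to fast speech in ms and as a share of the slow mean."""
    diff = slow_ms - fast_ms
    return diff, diff / slow_ms


for context, means in REPORTED_MEANS.items():
    diff, share = rate_shortening(means["slow"], means["fast"])
    print(f"{context}: {diff} ms shorter in fast speech ({share:.0%} of the slow-rate mean)")
```

In both environments voicing shortens by 12 ms from slow to fast speech, whereas the voiceless categories, for which a single mean of 14 ms is reported, show no reliable change.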

Thus, the results show that C1 stops in prepositions before C2 consonants had

longer voicing than stops in word-internal clusters. Previous tests showed that speakers

tend to voice the entire closure; thus, longer duration of voicing in C1 stops in

prepositions might be attributed to a need to maintain longer voicing in the longer closure

of these stops.

4.4.3. Voicing Ratio

The next test examined the correlation between the two cues. Ratios of voicing to

closure duration were calculated to determine whether C1 stops in prepositions were

voiced during the entire closure. Mean voicing ratios (henceforth VR) of C1 stops are

shown in Table C3 (Appendix C).
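The voicing ratio defined above is the duration of voicing divided by the duration of closure for each token. The following sketch shows the computation, together with the share of fully voiced tokens. It is a minimal illustration: the tokens are hypothetical, and treating VR of 1.0 or more as "fully voiced" is an assumption rather than a criterion stated in the text.

```python
from typing import Iterable, Tuple


def voicing_ratio(voicing_ms: float, closure_ms: float) -> float:
    """VR: duration of voicing during closure divided by closure duration."""
    if closure_ms <= 0:
        raise ValueError("closure duration must be positive")
    return voicing_ms / closure_ms


def share_fully_voiced(tokens: Iterable[Tuple[float, float]], threshold: float = 1.0) -> float:
    """Proportion of tokens whose VR reaches `threshold` (assumed criterion for full voicing)."""
    ratios = [voicing_ratio(voicing, closure) for voicing, closure in tokens]
    return sum(r >= threshold for r in ratios) / len(ratios)


# Hypothetical (voicing_ms, closure_ms) pairs, for illustration only
HYPOTHETICAL_TOKENS = [(48, 48), (20, 50), (35, 35), (10, 40)]

for voicing, closure in HYPOTHETICAL_TOKENS:
    print(f"voicing {voicing} ms / closure {closure} ms -> VR = {voicing_ratio(voicing, closure):.2f}")
print(f"fully voiced: {share_fully_voiced(HYPOTHETICAL_TOKENS):.0%}")
```

Keeping VR separate from the count of fully voiced tokens mirrors the two comparisons made in this section: mean VR on the one hand, and the percentage of fully voiced stops on the other.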

A repeated measures ANOVA with domain (word, preposition) and following

segment (sonorant, voiced, voiceless) performed on VR yielded a main effect of domain

(F(1,13)=10.67, p<0.01) and interaction with following segment (F(2,26)=26.69,

p<0.01). The interaction revealed that stops in prepositions had a greater VR than word-

internal stops only in voiceless clusters (F(1,13)=7.99, p<0.05); VR was not significantly

different in presonorant stops (F<1) or in stops before voiced C2s (F<1).

The test did not find a significant difference in mean VR between word-internal

stops and stops in prepositions. A difference was observed, however, in the number of

fully voiced stops. The percentage of fully voiced stops in prepositional and word-

internal clusters is shown in Table C4.

The statistical test revealed that a significant difference between stops in

prepositions and word-internal stops in clusters was obtained only for stops before

voiceless C2s (F(1,13)=6.30, p<0.05). Devoicing did not occur in 6.5% of underlying

voiced stops in prepositions, whereas devoicing in word-internal clusters was complete.

Complete voicing of underlying voiceless stops before voiced C2s was less regular in

stops in prepositions (91% of cases) compared with word-internal stops in clusters (98%

of cases), but this difference was not significant (F(1,13)=2.63, p=0.129).

4.4.4. Duration of a preceding vowel

The next analysis examined duration of a preceding vowel as a cue to voicing.

The measurements are shown in Table C5 (Appendix C). The results of the test are

summarized in Figure 22. Crucially, the test revealed a significant main effect of

underlying voicing (F(1,13)=73.05, p<0.001): vowels were, on average, longer before an

underlying voiced stop (M=78 ms, SD=11) than before an underlying voiceless stop

(M=69 ms, SD=12).

Underlying voicing interacted with following segment (F(2,26)=6.32, p<0.01),

revealing that differences in vowel duration between underlying voiced and voiceless

stops were significant in all clusters, but they were greater before a cluster with a

sonorant (12 ms) than before clusters with a voiced stop (8 ms) and a voiceless stop (8

ms).

A main effect of prosodic domain (F(1,13)=40.89, p<0.001) and

voicing × domain interaction (F(1,13)=6.79, p<0.05) revealed that vowels preceding a

stop were on average shorter in a preposition (M=68 ms, SD=12) than in a word (M=79

ms, SD=11) and the significant difference in duration before underlying voiced and

voiceless stops was greater in a clitic (12 ms) than in a word (7 ms).

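Figure 22. The effects of (a) underlying voice and (b) following segment on duration of a preceding vowel in prepositions and word-internally.

[Figure 22 panels, y-axis: Duration (ms). (a) Underlying voice: voiceless vs. voiced stops in words and prepositions. (b) Following segment: words vs. prepositions before sonorant, voiced, and voiceless segments.]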

A main effect of following segment was also obtained (F(2,26)=201.3, p<0.001),

as was an interaction with domain (F(2,26)=86.11, p<0.001), revealing that the effect of

following segment was stronger in a word (see Figure 22b). The vowel had the longest

duration before a cluster with a sonorant (M=101 ms, SD=13) and shorter duration before

a voiced cluster (M=77 ms, SD=10) and a voiceless cluster (M=59 ms, SD=11); all

differences were significant. The effect of following segment on a preceding vowel in a

clitic was considerably weaker. Vowel duration was significantly shorter before a

voiceless cluster (M=66 ms, SD=11; F(2,26)=5.76, p<0.01); no significant difference was

obtained between vowel durations before a sonorant (M=69 ms, SD=12) and before a

voiced cluster (M=68 ms, SD=12).

Finally, a main effect of speaking rate was found (F(1,13)=68.77, p<0.001): as

expected, vowels were longer in slow speech (M=83 ms, SD=12) and shorter in fast

speech (M=64 ms, SD=11).

The results show that duration of a preceding vowel is a significant cue to the

underlying voicing contrast not only in stops in word-internal clusters but also in stops in

prepositions.

4.4.5. F0 and F1

The final analysis examined fundamental frequency and frequency of the first

formant of a preceding vowel. A summary of the results of the measurements for

prepositional clusters is given in Table C6 (Appendix C).

No main effects or interactions were found for fundamental frequency (f0). The

underlying voicing contrast was not found in either prosodic domain (F<1).

The underlying voicing contrast was found in word-internal stops for F1

(F(1,13)=4.96, p<0.05), as shown earlier in Chapter 3. No contrast before voiced and

voiceless C1 stops was observed in prepositions, however. Speaking rate affected F1 in

prepositions: frequency was higher in slow speech (M=549 Hz, SD=98) and lower in fast

speech (M=507 Hz, SD=74), suggesting speakers produced some differences in glottal

configuration as a function of speaking rate.

4.5. Discussion and conclusion

The results strongly suggest that voice assimilation in C1 stops in clusters across a

clitic boundary is not different from voicing in C1 stops in word-internal clusters.

Ultimately, the underlying voicing contrast was preserved in a presonorant position; the

contrast was neutralized in stop clusters. Voicing in C1 was completely affected by the

voicing properties of C2 stops. Thus, from the phonological perspective, the process of

voice assimilation was observed in both prosodic domains.

The results provide enough evidence to reject the “final devoicing” hypothesis

proposed in Lihtman (1980). Incomplete voicing in C1 stops was occasionally found not

only in prepositions but also in word-internal clusters. No difference in voicing was

found between underlying voiced presonorant stops in prepositions, which may,

according to the “final devoicing” hypothesis, be devoiced, and underlying voiced

presonorant stops in word-internal clusters, where final devoicing is impossible. The

results suggest that incomplete voicing is the effect of a consonant cluster rather than the

result of occurrence of a stop preposition-finally. I conclude that no “final devoicing” is

found in final stops in Russian prepositions.

The results of voice assimilation are more consistent with the “incomplete

assimilation” hypothesis (Robblee and Burton 1999). The two domains were different in

the implementation of voicing in clusters. Three major differences were found between

clusters in prepositions and word-internal clusters. First, assimilation was less regular in

stops in prepositions as no devoicing of underlying voiced stops before voiceless C2s was

observed in 6.5% of tokens. Second, devoiced stops in prepositions had much longer

duration of voicing into closure than voiceless stops in word-internal clusters. This

difference was found both in absolute duration of voicing and in VR. Stop closure in

voiceless stops in prepositions was voiced on average for almost 30% of its duration in

slow speech and 45% in fast speech, but in word-internal voiceless stops it was voiced

only for 20% in slow speech and 25% in fast speech. Finally, assimilated stops in

prepositions left fewer traces of underlying voicing than stops in word-internal clusters.

These traces were registered as differences in F1 before the closure of a C1 stop and

difference in duration of a preceding vowel. Recall that speakers produced a difference

between underlying voiced stops and assimilated voiced stops in word-internal clusters:

the former were produced with a lower F1. Speakers also differentiated between

underlying voiced and voiceless C1 stops in prepositions in the duration of a preceding

vowel, which was longer before underlying voiced stops. The results show that stops in

clusters across a clitic boundary had traces of underlying voicing only in duration of a

preceding vowel. No traces in F1 were observed.

Some results, nevertheless, are not consistent with the “incomplete assimilation”

hypothesis. This hypothesis explains why some underlying voiced C1 stops in voiceless

clusters are produced as partially voiced, but it does not explain longer voicing into

closure in underlying voiceless stops before voiceless C2 stops, i.e. in the environment

where no effect of underlying voicing is expected. Several factors are likely to

simultaneously affect production of voicing in stops in prepositions.

First, speaking rate affects voicing. Note that variation in prepositions was found

more often in fast speech. Speakers may fail to control proper timing of articulatory

gestures during spontaneous assimilation in fast speech. Voicing tails in voiceless stops

were often found to be slightly longer in fast speech than in slow speech in all

environments. This is indicative of a bigger overlap in articulatory gestures (see Zsiga

1994 for details on consonant overlap).

Next, the prosodic boundary may partly attenuate the results of voice assimilation.

The clitic boundary may provide an environment in which speakers exercise less control

over potentially assimilated stops. The overlap in articulatory gestures (Zsiga 1994) was

observed more often in stops in prepositions: 27 C1 stops were pronounced as unreleased

in clusters across a clitic boundary, but all C1 stops (except one) were released in word-

internal clusters. When speakers do not anticipate the category of a following segment,

they may become less precise in articulatory gestures, including laryngeal gestures.

Similar variation in assimilated segments was found in optional

coronal place assimilation across a word boundary in English (e.g. Barry 1985, Byrd

1996 among others) and in post-lexical palatalization before /j/ in American English

(Zsiga 1995).

In addition, these differences may mean that production of voicing in stops was

“sloppier” in prepositions because speakers were less accurate in producing the voicing

contrast in prepositions, which do not have a lexical contrast. The prepositions nad ‘over’

and ot ‘from’ do not have minimal lexical pairs in Russian; therefore, maintaining a

voicing contrast in these prepositions may be a less important task than in morphemes

with minimal pairs, which have been shown in other languages to be more resistant to

phonological alternations to facilitate lexical access in dense neighborhoods (Wedel

2002, Ussishkin and Wedel 2009). These factors will be discussed at greater length in

Chapter 8.

I conclude that voice assimilation can be considered a phonological process in

clusters across a clitic boundary in Russian since more than 90% of C1 stops underwent

transformation of their underlying specification for voicing. The clitic boundary,

however, affects low-level phonetic realization of voice assimilation. The phonetic

outcome of voice assimilation across a clitic boundary is more variable. These results

provide background for the analysis of voicing in obstruent-sonorant-obstruent clusters,

which is discussed in Chapter 5.

CHAPTER V

EXPERIMENT 3: VOICE ASSIMILATION IN OBSTRUENT-SONORANT-OBSTRUENT¹⁸ CLUSTERS: ARGUMENTS AGAINST ‘SONORANT TRANSPARENCY’

5.1. Background

The third experiment was designed to investigate the acoustic properties of

presonorant stops in obstruent-sonorant-obstruent clusters to establish whether stops in

Russian can be assimilated in voice through a sonorant. Sonorant transparency in Russian

(Jakobson 1978) is controversial. Recall that, according to the claim of sonorant

transparency, stops in prepositions assimilate in voicing to a following obstruent through

an intervening sonorant consonant:

(22) o/t ++ mg/ly → o[d mg]ly ‘from haze’

na/d ++ mx/om → na[t mx]om ‘over the moss’

Such assimilation is one of the most unusual phenomena that have been reported

in studies of voice assimilation in the world’s languages. It violates one of the most

common properties that have been attributed to stops: preservation of the underlying

voicing specification before a sonorant (Trubetzkoy 1969).

¹⁸ ‘Sonorant’ in this phrase refers to a sonorant consonant rather than to any sonorant segment (e.g. a vowel). I will use the label ‘obstruent-sonorant-obstruent cluster’ to denote clusters /tmg/, /dmx/ and the like, in which “sonorant transparency” has been claimed to occur in Russian.

Not every linguist agrees about the existence of sonorant transparency in Russian.

Some linguists (Es’kova 1971, Kavitskaja 1999, Shapiro 1993) deny its existence.

According to Es’kova (1971: 245), voicing of /t/ in cases like ot mgly ‘from haze’ cannot

occur because clusters like [dmg] are not possible within one syllable. Kavitskaja (1999)

claims that sonorant transparency is not attested in the speech of several language

consultants in her study (speakers of the Moscow dialect of Standard Russian), as well as

in her own speech. Others argue that transparency is possible under certain

circumstances. Cho (1990) and Padgett (2002) argue that sonorant transparency to voice

assimilation in Russian is gradient. Ševoroškin (1971) argues that it is optional. He

departs from Hayes (1984), however, and claims that only phonetically devoiced

sonorants can trigger devoicing in preceding obstruents. Thus, /z/ in the phrase [iz mxa]

‘out of moss’ is voiced if [m] is voiced, but it is voiceless if the sonorant is devoiced: [is m̥xa].

Some cases of voice assimilation before a sonorant followed by an obstruent were

investigated by Robblee and Burton (1997). They tested four speakers who pronounced

sentences with embedded phrases that contained prepositions with voiceless obstruents s

‘with/from’ and ot ‘from’ before Liquid+Vowel and Liquid+Voiced obstruent sequences

(e.g. s lišnim ‘with extra’, s ldiny ‘from the ice’) and prepositions with voiced obstruents

iz ‘from’, bez ‘without’, nad ‘over’, and pod ‘under’ before Liquid+Vowel and

Liquid+Voiceless obstruent sequences (e.g. bez riska ‘without risk’, bez rtuti ‘without

mercury’). Mean closure duration and relative amplitude of low frequency energy were

taken as an indication of the voicing properties of the first obstruent in a cluster. The

results show that the second obstruent (C2) did not affect the laryngeal state of the first

obstruent (C1). Both mean closure duration and relative amplitude of /t/ and /d/ did not

differ significantly when the liquid was followed by a vowel or by an obstruent. The two

classes of obstruents – voiceless and voiced – remained distinct from each other.

However, Robblee and Burton did not investigate changes in voicing in obstruents in fast

speech.

Kulikov (2010) examined voicing in C1 obstruents in different speaking rate

conditions. Eight speakers of Russian produced prepositional phrases with obstruent-

sonorant-obstruent clusters in list, slow, and fast speech rate conditions. The final

obstruents in the prepositions iz ‘from’, s ‘with/off’, ot ‘from’, nad ‘over’ were

pronounced before nouns and adjectives that begin with a sonorant-obstruent cluster in

matched pairs. Underlying voiced C1s were pronounced before voiceless C2s; underlying

voiceless C1s were pronounced before voiced C2s: e.g. na/d mx/om ‘over moss’, /s

mz/doj ‘with a bribe’. These cases were compared with cases with obstruent-sonorant

clusters and obstruent-obstruent clusters: e.g. ot Moskvy ‘from Moscow’, iz Tambova

‘from Tambov’.

The results strongly suggest that no phonological change of underlying voicing

occurred in C1 obstruents in obstruent-sonorant-obstruent clusters. Presonorant

obstruents in obstruent-sonorant-obstruent clusters patterned with prevocalic obstruents

and were significantly different from assimilated obstruents in obstruent-obstruent

clusters. All speakers categorically assimilated C1 obstruents in both closure/frication

duration and amount of voicing before a word beginning with an obstruent. In contrast,

assimilation through a sonorant, evidence for which was observed only in 16% of tokens,

was not categorical; it showed great phonetic variation. There was an apparent

asymmetry in the direction of assimilation. Devoicing in tokens occurred more often

(23.0%) than voicing (10.2%). Considering the fact that Russian is a true voice language,

the effect of spreading the active phonological feature [voice] is expected at least as often

as the effect of devoicing.

In addition, speakers had distinct tendencies for devoicing and voicing in

presonorant obstruents. Some speakers (S7, S8) never produced voicing even in fast

speech while others (S3, S4) pronounced one third of underlying voiceless tokens as

voiced. Variability was also observed at the segmental level. Speakers varied in what

segments they tended to voice or devoice.

Due to limitations of the design, some important comparisons were not made in

Robblee and Burton (1997) and in Kulikov (2010). In order to establish the sources of

voicing and devoicing in C1 across a clitic boundary in obstruent-sonorant-obstruent

clusters in fast speech, it is crucial to compare voiced C1s before a sonorant followed by

a voiceless C2 (Voiced ++ Son ++ Voiceless) with voiced C1s before a sonorant followed

by a voiced C2 (Voiced ++ Son ++ Voiced), as well as before a voiced and voiceless C2

(Voiced ++ Voiced, Voiceless ++ Voiceless). In Chapters 3-4, assimilation in stop

clusters was established when voicing in C1 had an effect of C2 voicing. The same

criterion should be applied to examine presence or absence of voice assimilation in

obstruent-sonorant-obstruent clusters. Hence, the goal of this study is to examine effects

of a sonorant and C2 obstruent on voicing properties of C1 stops in obstruent-sonorant-

obstruent clusters across a clitic boundary and compare them with voicing properties of

C1 in obstruent-obstruent clusters with no intervening sonorant consonants.

Voicing in C1 stops in obstruent-sonorant-obstruent clusters was tested for an

effect of speaking rate to examine whether changes in C1 stop voicing as a function of

speaking rate are triggered by C2 or by underlying voicing. The results of Experiments 1-

2 in this study suggest that duration of voicing during closure changes as a function of

speaking rate when stops are specified with the feature [voice]. No change is expected in

voiceless, i.e. unspecified stops. Thus, if underlying voiceless C1 stops are assimilated to

voiced C2s, voicing in such stops is expected to change as a function of speaking rate.

If no assimilation occurs, duration of a short voicing tail into closure in such stops is

expected to remain stable across speaking rates.

5.2. Participants and stimuli

Fourteen native speakers of Russian, 7 males and 7 females, participated in the

experiment. They were the same subjects that participated in the previous experiments.

After inspection of the recordings, Speaker 6’s data were excluded due to consistent

deletion of C1 stops in obstruent-sonorant-obstruent clusters. His data were discarded

from the statistical analysis; however, his results are reported in a general discussion of

processes that were observed in obstruent-sonorant-obstruent clusters in prepositions.


The list of stimuli consisted of four phrases with a preposition ending in an

underlying voiceless stop (ot ‘from’) and four phrases with a preposition ending in an

underlying voiced stop (nad ‘over’). These prepositions preceded nouns which begin with

a sonorant-obstruent cluster with a voiced or voiceless C2: e.g. nad rtutju ‘over mercury’,

ot lgunji ‘from a liar’. Thus, there were target phrases with four types of obstruent-

sonorant-obstruent combinations; they were coded as /VD++Son VD/ (/dlg/, /dmz/),

/VD++Son VL/ (/drt/, /dmx/), /VL++Son VD/ (/tlg/, /tmz/), and /VL++Son VL/ (/trt/,

/tmx/). In addition, 8 phrases with the same prepositions before a noun beginning with a

single stop were used as a control category (assimilation condition). These phrases

contained clusters /dg/, /db/, /dp/, /dk/, /tg/, /tb/, /tp/, /tk/: e.g. nad parom ‘over steam’, ot

gaza ‘from gas’. The full list is given in Appendix A(3). Finally, there were 12 fillers

with assorted prepositional phrases unrelated to voice assimilation. The phrases were

randomized and presented to the participants as one set.

5.3. Procedure and measurements

The speakers were digitally recorded using the same procedure as in Experiments

1 and 2. The participants were asked to pronounce (read) phrases in three speaking rate

conditions. In Condition 1 (henceforth, list rate), the target phrases were pronounced

carefully as a word list. In Condition 2 (henceforth, slow rate), the phrases were placed in

a carrier phrase Skaži ____ ješče raz (‘Say ____ once again’) and were pronounced at a

comfortable tempo. In Condition 3 (henceforth, fast rate), the phrases were placed in the

same carrier phrase Skaži ____ ješče raz (‘Say ____ once again’) and pronounced

quickly. The speakers were instructed to say the phrases as if they were trying to say

something important to a person who is leaving the room.

For each condition, the speakers read the materials three times, but only the

second and third readings were recorded. For each speaker, 96 test phrases (16 phrases × 3

conditions × 2 readings) were recorded; thus, across the 13 speakers retained for analysis, the total number of target segments was


1248. Forty tokens were discarded due to absence of release, nasalization, deletion, or

metathesis of C1 stops, leaving 1208 tokens for the statistical analysis.
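
The token counts above follow from simple arithmetic, which the short Python sketch below reproduces. The variable names are illustrative only, and the figure of 13 speakers assumes that Speaker 6 is excluded, as stated in Section 5.2.

```python
# Illustrative check of the token counts reported in the text.
# All variable names are hypothetical; the numbers come from the text.
PHRASES = 16      # 8 target phrases + 8 control phrases
CONDITIONS = 3    # list, slow, and fast rates
READINGS = 2      # second and third readings
SPEAKERS = 13     # 14 participants minus Speaker 6 (assumed excluded here)
DISCARDED = 40    # tokens with no release, nasalization, deletion, or metathesis

per_speaker = PHRASES * CONDITIONS * READINGS   # phrases recorded per speaker
total_tokens = per_speaker * SPEAKERS           # target segments overall
analyzed_tokens = total_tokens - DISCARDED      # tokens kept for the analysis
```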

Voicing properties of target obstruents were investigated using the same acoustic

measurements and the same criteria as in Experiments 1 and 2. The obstruent boundaries

were set manually in PRAAT. Figure 23a illustrates a voiced C1 stop before a voiceless

C2 stop in an obstruent-sonorant-obstruent cluster; Figure 23b illustrates a voiceless C1

stop before a voiced C2 fricative.

Figure 23. Examples of C1 and C2 tokens in phrases (a) nad rtutju ‘over mercury’, spoken by S10 (m), fast rate (C1 stop closure, fully voiced, C2 closure, voiceless) and (b) nad parom ‘over steam’, spoken by S1 (f), fast rate (C1 stop closure, voiceless, C2 closure, voiceless).


Assimilation of target tokens was determined using the cut-off point at the mean

VR of voiceless and voiced stops plus two standard deviations. Voiceless tokens with

voicing during closure higher than the cut-off were counted as assimilated (voiced); voiced

tokens with voicing during closure lower than the cut-off were counted as assimilated

(devoiced).
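
The cut-off criterion can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used in the study: the function names and the sample VR values are hypothetical, the sample standard deviation is assumed, and only the voiceless-to-voiced direction is shown.

```python
from statistics import mean, stdev

def mean_plus_2sd(reference_vr):
    """Cut-off = mean + 2 * sample SD of a reference set of VR values (%)."""
    return mean(reference_vr) + 2 * stdev(reference_vr)

def is_assimilated_voiced(token_vr, cutoff):
    """A voiceless token counts as voiced if its VR exceeds the cut-off."""
    return token_vr > cutoff

# Hypothetical VR values (%) for a reference set of voiceless stops.
reference = [0, 10, 20]
cutoff = mean_plus_2sd(reference)

# Hypothetical underlying voiceless tokens checked against the cut-off.
flags = [is_assimilated_voiced(vr, cutoff) for vr in (5, 25, 45)]
```

With a real data set, the reference list would hold the measured VR values of the relevant stop category rather than the three made-up numbers used here.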

5.4. Results

The objective of the analysis was to determine whether voice assimilation takes

place in obstruent-sonorant-obstruent clusters at different speaking rates. If the first

obstruent in the cluster (C1) is assimilated in voicing, the voicing properties of this

segment should be consistent with the voicing properties of the second obstruent in the

cluster (C2). If, in contrast, the voicing properties of the C1 segment are consistent with

the underlying specification for voice, it is taken as evidence for absence of voice

assimilation.

Changes in several acoustic measurements are diagnostic of voice assimilation in

C1: closure duration, voicing duration, voicing ratio, duration of a preceding vowel, as

well as f0 and F1 frequencies. These measurements were tested for effects of C1

underlying voicing (voiced, voiceless), C2 voicing (voiced, voiceless), cluster type (with

a sonorant, without a sonorant), and speaking rate (slow, fast) using a repeated measures

ANOVA. An effect of C2 would indicate sonorant transparency. If, however, sonorant

transparency is not found in the data, cluster type is predicted to interact with C1 voicing

and C2 voicing. For the clusters with a sonorant, an effect of C1 voicing and no effect of

C2 voicing are expected, indicating preservation of the underlying contrast and no

assimilation. For the clusters without a sonorant, no effect of C1 voicing and the effect of

C2 voicing are expected, which indicates voice assimilation. Presence of sonorant

transparency can be established if no effect of C1 voicing and the effect of C2 voicing are

found in both types of clusters.
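
The predictions above amount to a small decision table, sketched here in Python. The function names and the boolean encoding of a significant effect are assumptions made for illustration. The outcome labelled "mixed" is not discussed in the text and is included only so that every input has a result.

```python
def interpret_cluster(c1_effect: bool, c2_effect: bool) -> str:
    """Map the ANOVA outcomes for one cluster type to an interpretation."""
    if c2_effect and not c1_effect:
        return "assimilation"        # C1 voicing follows C2 voicing
    if c1_effect and not c2_effect:
        return "contrast preserved"  # C1 keeps its underlying voicing
    return "mixed"                   # hypothetical catch-all, not in the text

def sonorant_transparency(with_sonorant, without_sonorant) -> bool:
    """Transparency holds only if both cluster types pattern as assimilation.

    Each argument is a (c1_effect, c2_effect) pair of booleans.
    """
    return (interpret_cluster(*with_sonorant) == "assimilation"
            and interpret_cluster(*without_sonorant) == "assimilation")
```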


The analysis involved several stages. First, the effect of the speech tempo on

production was investigated to determine whether the speaking rate manipulation had the

intended effect. Next, the voicing properties of C2 were analyzed to establish whether the

segments that may determine the results of voice assimilation remain stable. Finally, the

C1 was analyzed to assess the degree of assimilation in C1 in different speaking rate

conditions. The C1-Sonorant-C2 clusters were compared to the C1-C2 clusters to

determine whether the intervening sonorant prevents voice assimilation. A preliminary

test showed that duration of voicing was not significantly different in the list and slow

speech conditions (t(12)=0.51, p=0.619); thus, the data for the list condition were

dropped in favor of analysis in more realistic speaking conditions.
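
The reported t(12) is consistent with a paired comparison across the 13 speakers retained for analysis. A minimal sketch of such a test, using only the Python standard library, is given below. The function name and the sample durations are hypothetical, and the p-value lookup is left out because the standard library does not provide a t distribution.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Return (t, df) for a paired t-test on two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [a - b for a, b in zip(x, y)]        # per-speaker differences
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))   # mean difference / standard error
    return t, n - 1

# Hypothetical per-speaker mean voicing durations (ms) in two conditions.
list_rate = [10, 12, 14]
slow_rate = [8, 11, 11]
t_value, df = paired_t(list_rate, slow_rate)
```

With 13 speakers the same function would return 12 degrees of freedom, matching the form of the statistic reported above.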

5.4.1. Effect of speaking rate

As expected, all speakers produced longer phrases in slower speech (t(12)=12.1,

p<0.001). Figure 24 shows the mean duration of target phrases in slow and fast

conditions (Slow: M=667 ms, SD=53; Fast: M=489 ms, SD=48).

Figure 24. Mean phrase duration in slow and fast speaking rate conditions.

[Figure 24 plot: y-axis Duration (ms), 0 to 800; x-axis Speaking Rate (Slow, Fast).]


5.4.2. Voicing in C2 obstruents

Figure 25 shows voicing in C2 obstruents in the slow and fast speaking rate

conditions. A repeated measures ANOVA with underlying voice (voiced, voiceless) and

speaking rate (slow, fast), performed on voicing ratio of C2 obstruents, revealed an effect

of underlying voice (F(1,12)=1684, p<0.0001). As expected, C2 obstruents retained their

underlying specification for voice. Underlying voiced obstruents were voiced for 94% of

their closure/frication; underlying voiceless obstruents were voiced for 1.2% of their

duration. No effect of rate (F(1,12)=2.42, p=0.146) was obtained: voicing ratio in C2

obstruents did not change in different speaking rate conditions.

Figure 25. Mean VR of voiced and voiceless C2 obstruents in the slow and fast rates.

5.4.3. Closure duration of C1 stops

Acoustic measurements of closure duration are summarized in Table D1

(Appendix D). A repeated measures ANOVA with cluster type (with a sonorant, without

a sonorant), underlying voicing (voiced, voiceless), C2 voicing (voiced, voiceless), and

speaking rate (slow, fast) was performed on closure duration. An effect of cluster type

was obtained (F(1,12)=12.08, p<0.01). Stops were longer when they preceded a sonorant

(M=60 ms, SD=15) and shorter in obstruent clusters (M=51 ms, SD=9).

[Figure 25 plot: y-axis Voicing ratio (%), 0% to 100%; x-axis Speaking Rate (Slow, Fast); series Voiced and Voiceless.]


A main effect of C2 voicing was obtained (F(1,12)=8.36, p<0.05), as was its

interaction with underlying voice (F(1,12)=5.79, p<0.05). The interaction revealed that

underlying /d/s were significantly shorter than underlying /t/s only before voiceless C2s

(Voiced: M=55 ms, SD=12; Voiceless: M=61 ms, SD=12; F(1,12)=5.01, p<0.05). No

significant difference in duration was found before voiced C2s (M=53 ms, SD=11;

F(1,12)=1.04, p=0.326).

As expected, speaking rate affected closure duration (F(1,12)=49.91, p<0.001):

stops were longer in slow speech (M=61 ms, SD=12) and shorter in fast speech (M=50

ms, SD=12).

5.4.4. Voicing duration

Distributions of voicing duration in C1 stops in slow and fast speaking rate

conditions show that the underlying voicing contrast in presonorant position was not

neutralized as a function of voicing in the following C2 in either rate (Figure 26).

However, a bigger overlap between voicing in underlying voiceless and

underlying voiced C1 stops was found in obstruent-sonorant-obstruent clusters than in

regular obstruent-sonorant clusters (see Figure 20 a,b). In this experiment, the categorical

boundary between phonetically voiced and voiceless stops was established at 35 ms,

using the formula “mean + 2 SD”. 22% of underlying voiced stops had voicing durations

shorter than 35 ms and overlapped with underlying voiceless stops in slow speech, and

26% overlapped in fast speech. Overlap was also found at the other end of the distribution.

8% of underlying /t/s had voicing durations longer than 35 ms and overlapped with

underlying /d/s in slow speech; 7% of underlying /t/s overlapped in fast speech. Thus, the

total overlap was 30% in slow speech and 35% in fast speech. This overlap, however,

was not caused by the following C2. The distributions in Figure 26 clearly show that the

range of the overlap was nearly identical in obstruent-sonorant-obstruent clusters with

voiced and voiceless C2 obstruents.


Figure 26. Distributions of voicing during closure in presonorant /t/ and /d/ stops in obstruent-sonorant-obstruent clusters, broken down by speaking rate (slow, fast) and C2 type (voiced C2, voiceless C2).

Acoustic measurements of duration of voicing are summarized in Table D2

(Appendix D). A repeated measures ANOVA with cluster type (with a sonorant, without

a sonorant), underlying voicing (voiced, voiceless), C2 voicing (voiced, voiceless), and

speaking rate (slow, fast) was performed on voicing duration.

Crucially, a significant interaction of cluster type with underlying voicing

(F(1,12)=95.9, p<0.001) and C2 voicing (F(1,12)=127.1, p<0.001) was found (see Figure

27a), revealing that C1 stops had retained underlying voicing when they were in a

presonorant position (Voiced: M=49.3 ms, SD=14; Voiceless: M=18.0 ms, SD=7;

F(1,12)=144.4, p<0.001), with no effect of C2 voicing (F<1). In obstruent-obstruent

clusters, no effect of underlying voicing in C1 stops was found (Voiced: M=34 ms,

SD=7; Voiceless: M=31 ms, SD=8; F<1), but the effect of C2 was, in contrast, significant

(Voiced: M=47 ms, SD=9; Voiceless: M=18 ms, SD=7; F(1,12)=220.5, p<0.001).

[Figure 26 plot: four panels (Slow: Voiced C2; Fast: Voiced C2; Slow: Voiceless C2; Fast: Voiceless C2), each showing Frequency (%) of /t/ and /d/ tokens by Voicing bin (ms), 0 to 100.]

Speaking rate affected voicing in C1 (F(1,12)=8.64, p<0.05), but due to a

significant rate × C2 voice × cluster interaction (F(1,12)=15.14, p<0.01), the effects of

rate in clusters with and without a sonorant were investigated separately (see Figure 27b).

In obstruent-obstruent clusters, rate interacted with C2 voice (F(1,12)=40.69, p<0.001)

rather than with underlying voice (F<1), revealing that duration of voicing changed as a

function of speaking rate only before voiced C2s (Slow: M=53 ms, SD=10; Fast: M=42

ms, SD=8; F(1,12)=35.54, p<0.001). Voicing in voiceless stop clusters did not change as

a function of speaking rate (F<1). This change was observed only in phonetically voiced

stops, in accordance with the pattern found for word-internal stops.

Figure 27. Effects of cluster type (a) and speaking rate (b) on duration of voicing in C1 stops, pooled across 13 speakers.

In clusters with presonorant C1 stops, no effect of rate was obtained (F<1) and

rate did not interact with C2 or underlying voicing (F<1). Mean difference (5 ms)

between slow and fast rates observed in voiced presonorant stops was not significant

(Slow: M=51 ms, SD=13; Fast: M=47 ms, SD=15; F(1,12)=1.17, p=0.300). Although no

effect of rate was found for voicing in presonorant stops, this is still consistent with the

pattern observed in presonorant word-internal stops (see Chapter 3).

[Figure 27 plots: (a) Cluster type: Duration (ms), 0 to 60, for /t/ and /d/ with C2 Voiced and C2 V-less in C1-Son-C2 and C1-C2 clusters; (b) Rate: Duration (ms), 0 to 60, for Slow and Fast with Voiced and V-less in C1-Son-C2 clusters and C2 Voiced and C2 V-less in C1-C2 clusters.]

5.4.5. Voicing Ratio

In addition to the acoustic cues (closure duration and duration of voicing), voicing

ratios were calculated to examine percent of voicing during closure in presonorant stops

in obstruent-sonorant-obstruent clusters. A summary is given in Table D3 (Appendix D).

The distributions of VR in presonorant C1 stops (Figure 28) show that the

underlying voicing contrast was not neutralized in presonorant /t/s and /d/s. In slow

speech, the majority (95%) of /t/-tokens fell into the voiceless category with VR lower

than 50%; the majority (75%) of /d/-tokens were unambiguously voiced and had VR

greater than 90%.
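
Voicing ratio is the share of the closure that is voiced, so it can be computed directly from the two duration measurements. The sketch below is an illustration only: the function names and sample durations are hypothetical, the 50% and 90% thresholds are taken from the description above, and the "intermediate" label for values in between is an assumption.

```python
def voicing_ratio(voicing_ms: float, closure_ms: float) -> float:
    """Percent of the closure interval that is voiced."""
    if closure_ms <= 0:
        raise ValueError("closure duration must be positive")
    return 100.0 * voicing_ms / closure_ms

def vr_category(vr: float) -> str:
    """Bin a VR value (%) using the thresholds mentioned in the text."""
    if vr < 50:
        return "voiceless"
    if vr > 90:
        return "voiced"
    return "intermediate"  # assumed label; not a category from the text

# Hypothetical token: 18 ms of voicing in a 60 ms closure.
example_vr = voicing_ratio(18, 60)
example_category = vr_category(example_vr)
```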

In fast speech, 86% of presonorant /t/-tokens unambiguously fell into a

“voiceless” category, and 78% of /d/-tokens unambiguously fell into a “voiced” category.

There was some overlapping between underlying voiceless and voiced stops in all

presonorant clusters suggesting C1 stops had gradient changes in phonetic voicing.

It is crucial, however, that these changes in voicing duration in C1 stops were not

caused exclusively by voicing of the following C2 obstruent and thus cannot be

interpreted as voice assimilation. Table D4 (Appendix D) shows that 2% of /t/s in slow

speech and 8% of /t/s in fast speech were produced with a fully voiced closure before

voiceless C2s, i.e. in the environment in which no influence of C2 voicing should occur.

Similarly, presonorant /d/s were fully voiced before voiced C2 only in 82% of cases in

slow speech and in 71% in fast speech, suggesting that shorter voicing in such clusters is

not a result of voice assimilation.


Figure 28. Distributions of VR in presonorant C1 stops in clusters, broken down by C2 obstruent (voiced – upper row, voiceless – lower row) and rate (slow – left column, fast – right column), pooled across 13 speakers.

5.4.6. Vowel duration

Acoustic measurements of vowel duration are shown in Table D5 (Appendix D).

Figure 29 summarizes the results of the statistical test. An effect of C2 was observed only

in obstruent-obstruent clusters (Voiced: M=68 ms, SD=13; Voiceless: M=66 ms, SD=12;

F(1,12)=4.77, p=0.05). No effect of C2 was found before a cluster with an intervening

sonorant (M=66 ms, SD=15; F<1).

A main effect of C1 voicing was obtained (F(1,12)=49.91, p<0.001), indicating

that the preceding vowel was longer before underlying voiced stops (M=73 ms, SD=13)

and shorter before voiceless stops (M=60 ms, SD=13). An interaction with cluster type

(F(1,12)=5.21, p<0.05) revealed that all differences were significant, but the difference

was greater before presonorant C1 stops (Voiced: M=73 ms, SD=15; Voiceless: M=58

ms, SD=13; F(1,12)=42.27, p<0.001) and smaller before obstruent-obstruent clusters

(Voiced: M=72 ms, SD=11; Voiceless: M=62 ms, SD=14; F(1,12)=33.93, p<0.001).

[Figure 28 plot: four panels (Slow: Voiced C2; Fast: Voiced C2; Slow: Voiceless C2; Fast: Voiceless C2), each showing Frequency (%) of /t/ and /d/ tokens by VR bin (%), 0 to 100.]

Figure 29. Effects of cluster type on duration of a preceding vowel in C1 stops.

Speaking rate affected vowel duration (F(1,12)=11.61, p<0.01): vowels were

longer in slow speech (M=72 ms, SD=14) than in fast speech (M=61 ms, SD=13). Thus,

the results suggest that with respect to duration of a preceding vowel, the two types of

clusters were different. No influence of C2 was observed in vowels preceding clusters

with an intervening sonorant.

5.4.7. F0 and F1

A summary of the acoustic measurements for f0 and F1 is shown in Table D6

(Appendix D). Statistical tests did not reveal an effect of C2 voicing on f0 (F(1,12)=3.6,

p=0.081) or F1 (F<1), suggesting that no assimilation occurred. The underlying voicing

contrast for f0 was found for presonorant stops (F(1,12)=5.71, p<0.05). No underlying

voicing contrast was found for F1 (F<1). Only the effect of speaking rate was significant

for F1 (F(1,12)=32.6, p<0.001), suggesting that first formant frequencies were different

in fast speech than in slow speech (Slow: M=548 Hz, SD=95; Fast: M=496 Hz, SD=76).

[Figure 29 plot: y-axis Duration (ms), 0 to 100; x-axis C2 Voiced and C2 V-less within C1-Son-C2 and C1-C2 clusters; series /t/ and /d/.]


5.4.8. Tokens with ‘transparency’ effect

The results provide evidence that all speakers produced at least a few presonorant

C1 tokens with greater variation in voicing duration. Some voiceless C1 stops had greater

VR (more than 50%) than typical presonorant voiceless tokens; some voiced C1 stops

were produced with incomplete voicing during closure. Waveforms and spectrograms of

two such tokens are shown in Figure 30a (longer voicing in a voiceless C1) and Figure

30b (shorter voicing in a voiced C1). Such tokens are evenly distributed within the whole

range of VRs, which suggests that there is variation in voicing duration.

Figure 30. Examples of tokens with ‘transparency effect’: (a) ot lgunji ‘from a liar’, spoken by S11 (m), fast rate (voicing of /t/ before a voiced C2) and (b) nad rtutju ‘over mercury’, spoken by S9 (m), fast rate (devoicing of /d/ before a voiceless C2).


Apparently, in the past such cases have sometimes been misinterpreted as

“sonorant transparency” and it was claimed that there is a phonological rule in Russian of

assimilation through a sonorant. However, the results clearly show that these few

examples do not represent a pervasive pattern. Devoicing occurred in 22% of clusters and

voicing was found in 9% of clusters with a sonorant across all speaking rates.

Speakers differed in their tendencies to devoice or to voice C1 stops in

clusters where the ‘transparency’ effect was observed. Table 15 shows individual results

of changes in voicing duration in /tlg/, /tmz/, /dmx/, and /drt/ clusters for 13 speakers.

Table 15. Percentage of C1 tokens with variation in voicing duration in slow and fast speech for 13 speakers.

                    Slow                    Fast
Speaker   Sex   Voicing   Devoicing   Voicing   Devoicing
S1        f     0%        25%         25%       0%
S2        m     25%       0%          50%       0%
S3        f     0%        0%          0%        0%
S4        f     0%        50%         0%        0%
S5        m     0%        0%          0%        0%
S7        f     0%        50%         0%        100%
S8        m     0%        33%         0%        50%
S9        m     0%        50%         0%        100%
S10       m     0%        0%          0%        0%
S11       m     0%        0%          25%       0%
S12       f     25%       75%         0%        50%
S13       f     0%        0%          25%       0%
S14       f     50%       0%          0%        0%
Total           8%        23%         10%       20%

Note: Speaker 6, whose data were excluded from the statistical analysis, deleted all underlying voiceless tokens.

Three speakers (S3, S5, and S10) did not show any evidence of variation in

voicing duration. Speakers 4, 7, 8, and 9 never voiced, and speakers 2, 11, 13, and 14


never devoiced. Only speakers 1 and 12 produced both voicing and devoicing, although

they were not always consistent in different speaking rate conditions. Partly, such a

preference is correlated with sex: female speakers tend to devoice, while male speakers

show a preference for voicing.

Variation observed in obstruent-sonorant-obstruent clusters can be understood

better if other cases of C1 stops and sonorants are considered. A summary is given in

Table 16.

Table 16. Percent of modified C1 stops and sonorants in obstruent-sonorant-obstruent clusters, pooled across 14 speakers and in slow and fast rate conditions.

                                    C1              Sonorant
Modifications                       /t/     /d/     /r/     /l/     /m/
Changes in voice duration in C1     2.3%    4.8%
Devoicing in Sonorant                               4.8%            3.0%
Nasalization in C1                  1.4%    6.0%
Denasalization in Sonorant                                          1.8%
Place assimilation                                                  0.9%
Metathesis                                  0.5%
Deletion                            0.9%    0.7%            0.2%

Note: The total number of examined clusters was 435.

The results show that changes in duration of voice are not the only modification

of the consonants in such clusters. They occurred in 7.1% of all cases and were less frequent

than nasalization of C1 stops, which occurred in 7.4% of cases. In some clusters, nasalization

coincided with denasalization of the sonorant [m] (1.8%), resulting in ‘Nasal switch’, i.e.

nasalization of [d] or [t] and denasalization and devoicing of the adjacent nasal [m] (e.g.

/nad+mxom/ → [nanpxom]). In 0.9% of the cases, [m] assimilated in place to the

adjacent coronal stop. Metathesis of the consonants in a preposition was found in 0.5% of


cases; it occurred in [dmx] and [dmz] clusters (see footnote 19): /nad+mxom/ → [danmxom],

/nad+mzdoj/ → [danmzdoj]. In a few cases (1.6% total), the first stop was deleted before

[r], and the sonorant [l] was deleted once (0.2%). All these cases can be described as

cluster simplifications that result in a simpler and less marked syllable structure.

Different clusters showed different modifications. Clusters with [m] changed most

often, and all modifications except deletion were observed in clusters with [m]. The

changes often affected both C1 and a sonorant, causing a complete transformation in the

consonant category. [r] was predominantly devoiced when it was modified, but this did

not necessarily cause devoicing in the preceding C1. Deletion was found only before [r].

Clusters with [l] were the most stable: [l] did not undergo modification and did not cause

deletion of C1. Only voice assimilation was found before [l].

No direct association between devoicing of a sonorant and devoicing of the

preceding stop was found. Rather, variation was found: both voiced and voiceless stops

occurred before voiced sonorants; voiced stops were also produced before voiceless

sonorants. The majority of segments, however, retained their laryngeal specifications

before a sonorant, which indicates presonorant faithfulness and absence of assimilation.

5.5. Discussion and conclusion

The results of the experiment provide evidence against the claim that voice

assimilation through a sonorant consonant in Russian is a “phonological rule of fast

speech” (Hayes 1984). Contrary to this claim, cases that could be interpreted as ‘sonorant

transparency’ in Russian are not bound to fast speech. In this study, they were found in

slow and fast connected speech.

Footnote 19: Metathesis more often occurred in clusters with a C1 fricative, where C1 metathesized with the

following sonorant: /iz+mxa/ → [imsxa], /s+mxom/ → [msxom]. These cases are not analyzed here.


The results of the tests strongly suggest that no phonological change of underlying

voicing occurred in C1 stops in obstruent-sonorant-obstruent clusters. Underlying voiced

presonorant stops were produced with acoustic properties consistent with voicing;

underlying voiceless presonorant stops had acoustic properties characteristic of voiceless

segments. This result was supported in a variety of acoustic measurements. No significant

effect of C2 obstruents was observed in presonorant C1 stops, suggesting that no voice

assimilation occurred. A strong effect of C2 voicing was found, in contrast, in obstruent-

obstruent clusters where assimilation occurred.

Manipulation of speaking rate also showed that differences in voicing duration as

a function of rate were triggered by C2 voicing only in obstruent-obstruent clusters. No

effect of rate on voicing in presonorant C1 stops was observed, but this result is,

nevertheless, consistent with the facts of voicing in presonorant stops examined in

Experiment 1. Apparently, voicing in stops before sonorant consonants is less responsive

to manipulation of speaking rate. Recall that voicing in presonorant intervocalic

underlying voiced stops in word-internal clusters (e.g. ka[dr]a ‘frame.Gen’) also did not

change as a function of speaking rate.

Is “sonorant transparency” an optional phonological rule? Optional rules are

shown to operate in restricted environments (e.g. Anttila 1997), but they are categorical.

Cases of “sonorant transparency”, in contrast, show great phonetic variation. Not only did

they occur in only 7.1% of all C1 stops in obstruent-sonorant-obstruent clusters, but there

is also an apparent asymmetry in the direction of these changes. Devoicing in tokens

occurs more often (22%) than voicing (9%). Considering the fact that Russian is a true

voice language, the effect of spreading the active phonological feature [voice] is expected

at least as often as the effect of devoicing.

The results show that voicing in long clusters is more susceptible to variation.

This may be a property of a long cluster. Shorter voicing in presonorant /d/s is not

necessarily the result of devoicing triggered by the voiceless C2 obstruent. Incomplete


voicing was found in presonorant /d/s followed by the voiced C2 as well. Similarly,

longer voicing in presonorant underlying voiceless stops was not necessarily caused by

the voiced C2 obstruent because the same tendency was found in presonorant /t/s

followed by the voiceless C2. Recall that in some cases underlying voiceless stops were

voiced during the entire closure in this position, which cannot be explained by voice

assimilation. In addition, variation in voicing in the C1 stops in clusters examined in this

study shows an effect of a clitic boundary, which was also observed in the prevocalic

stops discussed in Chapter 4. Next, other changes (such as nasalization or metathesis)

found in presonorant stops in obstruent-sonorant-obstruent clusters suggest that speakers

are dealing with articulatory difficulties in long clusters that violate the sonority

sequencing principle. All modifications observed in C1 stops and sonorants target cluster

simplification.

Variation was also observed at the segmental level. Speakers varied in what

clusters they tended to voice, devoice, or delete. Speakers changed phonetic properties of

the segments to make the cluster more homogenous. Sharing the feature [voice] is not the

only possibility, and speakers employed other options as well. The feature [nasal] was

shared as frequently as [voice], and in a few cases adjacent segments were assimilated in

place. The very limited number of cases of feature swapping and its irregular occurrence

in these clusters do not support a claim of phonological voice assimilation. Rather, the

results support the claim that these are low-level, phonetic modifications of long clusters.

Finally, variation was found not only in acoustic parameters of presonorant C1

stops but also in the speaker population. The results reveal that speakers had different

tendencies for devoicing and voicing before a sonorant. Some speakers never produced

voicing even in fast speech while others produced one third of underlying voiceless

tokens as voiced. Partly, such a preference is correlated with sex: female speakers tend to

produce less voicing during closure in clusters, while some male speakers show a

preference for voicing. The absence of uniformity among speakers suggests that


“sonorant transparency” is not a variable phonological rule, either. Variable rules were

shown to occur across homogenous sociolinguistic communities (Labov 1969). All

participants in this experiment belong to the same age and socioeconomic group: they are

college students and live in the same neighborhood. Yet they did not show uniform patterns

of voice assimilation through a sonorant.


CHAPTER VI

EXPERIMENT 4: VOICE ASSIMILATION BEFORE /V/

6.1. Background

The fourth experiment was designed to examine voice assimilation in stops in

prepositions before /v/. Recall that it has been claimed that /v/ before a sonorant does not

trigger voice assimilation in stops in Russian (Avanesov 1968):

(23) /tv/orec [tv] ‘creator’

/dv/orec [dv] ‘palace’

When /v/ occurs before an obstruent, it is claimed that it assimilates to it in voicing and

triggers voice assimilation in a preceding stop (Jakobson 1956, Hayes 1984):

(24) o/t vd/ovy [dvd] ‘from a widow’

na/d vp/uskom [tfp] ‘over the inlet’

However, voice assimilation before /v/ seems to be a less regular pattern than

voice assimilation in obstruent clusters. Reformatskii (1975) argues that the position

before /v/ in Russian is prone to preservation of the voicing contrast. In line with this,

Panov (1967) reports that speakers of Russian are not consistent in assimilating

obstruents in clusters with /v/: assimilation is found in some speakers and is absent in

other speakers. Absence of voice assimilation before /v/ in obstruent clusters means that

speakers preserve the underlying voicing contrast before /v/ in any position.

Unfortunately, these claims were based on impressionistic transcriptions. Even

the most recent claims that assimilation before /v/ followed by a voiced obstruent is a

pervasive pattern in Russian (e.g. Kn’azev 2006) are not supported by instrumental

analysis.

Hence, the goal of the experiment was to establish the facts about voice

assimilation before /v/ in clusters and compare these results with other cases of voice

assimilation in Russian.

6.2. Participants and stimuli

Fourteen speakers of Russian participated. They were the same subjects that

participated in Experiments 1-3. They read a list of prepositional phrases, which included

a preposition that ended with a voiced stop (e.g. nad ‘over’) or a voiceless stop (e.g. ot

‘from’). The list of target phrases is shown in Appendix A(4). Each preposition was

produced before a word that began with /v/ in three contexts: (1) followed by a vowel

(e.g. Volga ‘the Volga (river)’), (2) followed by a voiced stop (e.g. vdovy ‘widows’), and

(3) followed by a voiceless stop (e.g. vtornik ‘Tuesday’). Thus, six types of clusters were

produced. In addition, 12 filler phrases with assorted prepositional phrases were added.

The list was randomized and presented to the participants as one set.

6.3. Procedure and measurements

Each target phrase was read in three speaking rate conditions: as a list, in a phrase

at a slow rate, and in a carrier phrase at a fast rate. Participants read the stimuli three

times, but only the second and the third reading were recorded. Thus, the total number of

tokens was 504 (6 tokens x 2 readings x 3 conditions x 14 speakers).

The participants’ readings were digitally recorded and analyzed using the same

procedure as in the previous experiments. Five tokens were discarded due to omission of

/v/ in a cluster (3 tokens, speakers 4, 7, 12) or absence of release of a stop (2 tokens,

speakers 5 and 14). All 5 of these tokens were produced in fast speech. Hence, 499

tokens were analyzed.
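
The token counts reported here follow from crossing the design factors. The sketch below enumerates that crossing; it assumes that the "6 tokens" correspond to two preposition-final stops crossed with three contexts after /v/, and all variable names and labels are illustrative rather than taken from the original procedure.

```python
from itertools import product

# Design factors as described in Sections 6.2-6.3. Labels are illustrative;
# only the counts come from the text.
c1_stops = ["t", "d"]                        # preposition-final stop
contexts = ["vowel", "voiced", "voiceless"]  # segment following /v/
readings = [2, 3]                            # the first reading was not recorded
rates = ["list", "slow", "fast"]
speakers = range(1, 15)

design = list(product(c1_stops, contexts, readings, rates, speakers))
recorded = len(design)

# Discarded in fast speech: 3 tokens with /v/ omitted, 2 with an unreleased stop.
discarded = 3 + 2
analyzed = recorded - discarded

print(recorded, analyzed)
```

Enumerating the cells rather than multiplying the factor sizes makes it easy to check that every cluster type occurs in every rate condition for every speaker.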

6.4. Results

The analysis was performed in several stages. First, the effect of speaking rate

manipulation on word duration was examined to determine whether changes in speaking

rate achieved the desired effect. Second, three parameters of voicing – duration of stop

closure, voicing duration, and voicing ratio – were analyzed to establish whether

assimilation through /v/ before obstruents occurred. Third, distributions of voicing ratios

were examined to determine whether regular assimilation occurred before /v/.

Preliminary analysis showed that duration of voicing was different in list and slow

conditions (t(13)=3.85, p<0.001); therefore, the data for all speaking rate conditions were

used in statistical tests. No effect of gender was obtained (F<1); male and female

speakers performed in a similar fashion in this experiment.

6.4.1. Rate and word duration

The analysis (repeated measures ANOVA) showed that manipulation of speaking

rate produced an effect on word duration (F(1,13)=106.5, p<0.001). Figure 31

summarizes the results.

Figure 31. Effect of speaking rate on word duration.

Target phrases had the longest duration in the list reading (M=743 ms, SD=79),

shorter duration in the slow reading (M=669 ms, SD=58), and the shortest in the fast

reading (M=489 ms, SD=46); all differences were significant (t(13)=4.65, p<0.001;

t(13)=11.05, p<0.001). Speakers read phrases more slowly in the list and slow

conditions and faster in the fast condition.

[Figure 31: bar chart of word duration (ms), y-axis 0–800, for the List, Slow, and Fast conditions.]

6.4.2. Closure duration

The results of the acoustic measurements are summarized in Table E1 (Appendix

E). The analysis (repeated measures ANOVA with underlying voicing (voiced,

voiceless), segment type following /v/ (henceforth, following segment type) (vowel,

voiced, voiceless), and rate (list, slow, fast)) revealed an effect of underlying voicing

(F(1,13)=15.9, p<0.001), which interacted with following segment type (F(2,26)=4.4,

p<0.05). Underlying voiced stops were significantly shorter than underlying voiceless

stops only before /v/ followed by a vowel (F(1,13)=27.7, p<0.001).

No effect of underlying voicing on closure duration was found before /v/ followed

by an obstruent (Voiced: F(1,13)=1.96, p=0.185; Voiceless: F<1). The difference

between duration of stop closure of underlying voiced and voiceless stops before /v/ in

obstruent clusters was not significant (see Figure 32).

Figure 32. Effects of following segment (a) and speaking rate (b) on closure duration in stops before /v/.

An effect of following segment type was obtained (F(2,26)=24.54, p<0.001).

Closure duration was the longest before /v/ followed by a vowel; but the difference in

duration between a stop followed by a cluster of /v/ and a voiced or a voiceless obstruent

was not significant (F(1,13)=2.33, p=0.151).

[Figure 32: bar charts of closure duration (ms), y-axis 0–80, for /t/ and /d/; panel (a) by segment type (Vowel, Voiced, Voiceless), panel (b) by speaking rate (List, Slow, Fast).]

Finally, an effect of rate was obtained (F(2,26)=46.6, p<0.001) and it interacted

with voicing (F(2,26)=7.35, p<0.01). Closure duration was longer in the list and slow

conditions, and shorter in the fast condition. An interaction revealed that there was a

difference between underlying voiced and voiceless stops in the list (F(1,13)=34.7,

p<0.001) and slow (F(1,13)=11.6, p<0.01) conditions, but no difference in duration was

found in the fast condition (F<1). The results suggest that speakers preserved the voicing

contrast in stop duration before /v/ in prevocalic position in all speaking rate conditions.

In obstruent clusters with medial /v/, the contrast in stop duration was maintained in the

list and slow conditions, but it was neutralized in the fast condition.

6.4.2. Duration of voicing

The results of the acoustic measurements are presented in Table E2 (Appendix E).

The analysis (repeated measures ANOVA with underlying voicing (underlying voiced,

underlying voiceless), following segment type (vowel, voiced, voiceless), and rate (list,

slow, fast)) applied to duration of voicing revealed a strong main effect of underlying

voicing (F(1,13)=139.7, p<0.001) and interactions with following segment type

(F(2,26)=39.0, p<0.001) and rate (F(2,26)=10.6, p<0.001).

Separate ANOVAs performed for each following segment type and speaking rate

condition showed that the underlying voicing had a strong effect on voicing duration

before /v/ followed by a vowel (F(1,13)=89.8, p<0.001) and before /v/ followed by a

voiced obstruent (F(1,13)=51.5, p<0.001). The effect of underlying voicing before /v/

followed by a voiceless obstruent was also significant (F(1,13)=6.38, p<0.05). The results

mean that speakers preserved the contrast before /v/ not only in a prevocalic position, but

also in obstruent clusters (see Figure 33a).

A main effect of rate was not obtained (F(2,26)=2.75, p=0.085), but interaction

with voicing (F(2,26)=10.62, p<0.001) revealed that an effect of rate was observed only

in underlying voiced stops (F(2,26)=14.0, p<0.001). Interaction with following segment

type (F(4,52)=5.08, p<0.01) and a three-way interaction (F(4,52)=3.80, p<0.01) revealed

that voicing duration changed significantly as a function of speaking rate in underlying

voiced stops before /v/ followed by a vowel (F(2,26)=20.96, p<0.001) and before /v/

followed by a voiced obstruent (F(2,26)=4.09, p<0.05). No effect of rate was observed in

underlying voiceless stops (F(2,26)=1.39, p=0.266) and in underlying voiced stops before

/v/ followed by a voiceless obstruent (F<1).

Figure 33. Effects of following segment type (a) and speaking rate (b) on duration of voicing in stops before /v/.

Figure 34 shows distributions of voicing duration in the list and fast rate

conditions. Using the formula “mean + 2 SD”, the category boundary between underlying

voiced and underlying voiceless stops was established at 34 ms for the list and slow

conditions and 40 ms for the fast condition. The contrast was robust in a prevocalic

position, suggesting that the underlying voicing contrast was preserved. There was no

overlapping before /v/ followed by a vowel in the list reading. More overlapping was

found in fast speech: 28.5% of underlying /d/ tokens were produced with voicing shorter

than 40 ms, and 7.1% of underlying /t/ tokens were produced with voicing longer than 40

ms.
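
The "mean + 2 SD" boundary and the overlap percentages can be illustrated with a short sketch. It assumes that the boundary is computed over the voicing durations of underlying voiceless stops (as is done for voicing ratio in Section 7.4) and that the sample standard deviation is used; the function names and the duration values are invented for illustration and are not the study's data.

```python
from statistics import mean, stdev

def category_boundary(voiceless_durations):
    """Mean + 2 SD of voicing durations (ms) for underlying voiceless stops."""
    return mean(voiceless_durations) + 2 * stdev(voiceless_durations)

def overlap(voiced, voiceless, boundary):
    """Percent of tokens falling on the 'wrong' side of the boundary."""
    short_voiced = 100 * sum(d < boundary for d in voiced) / len(voiced)
    long_voiceless = 100 * sum(d > boundary for d in voiceless) / len(voiceless)
    return short_voiced, long_voiceless

# Invented voicing durations in ms, for illustration only.
t_tokens = [10, 12, 15, 18, 20, 14, 16, 11]
d_tokens = [55, 60, 48, 20, 70, 65, 52, 58]

boundary = category_boundary(t_tokens)
print(round(boundary, 1))
print(overlap(d_tokens, t_tokens, boundary))
```

The first value returned by `overlap` corresponds to underlying /d/ tokens with voicing shorter than the boundary, the second to underlying /t/ tokens with voicing longer than the boundary.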

[Figure 33: bar charts of voicing duration (ms), y-axis 0–60, for /t/ and /d/; panel (a) by segment type (Vowel, Voiced, Voiceless), panel (b) by speaking rate (List, Slow, Fast).]

Figure 34. Distributions of voicing duration of underlying /d/ and /t/ before /v/ followed by a vowel, a voiced obstruent, and a voiceless obstruent in the list (left column) and fast (right column) conditions. [Histograms: x-axis, voicing bin (ms), 0–90; y-axis, number of tokens, 0–30; rows: Vowel, Voiced C2, Voiceless C2; series: /t/ and /d/.]

Some contrast was also preserved before /v/ followed by a voiced obstruent, but

more overlapping between the voiced and voiceless categories was found than before a

prevocalic /v/. In the list condition, all underlying /d/ tokens were produced as voiced

before the /vd/ cluster. 75% of underlying /t/ tokens were produced as voiceless with

voicing shorter than 34 ms, which indicates that no assimilation occurred in these tokens.

In the fast condition, the overlap was greater but still incomplete. 67.8% of underlying /t/

tokens fell into the voiceless category, suggesting that assimilation did not occur.

Overlap of voicing durations of stops before /v/ followed by a voiceless obstruent

was complete in all speaking rate conditions, which means that voice assimilation

occurred in stops before /v/ followed by a voiceless stop.

6.4.3. Voicing ratio

A summary of voicing ratio calculations is given in Table E3 (Appendix E). The

statistical test found a main effect of underlying voicing (F(1,13)=124.8, p<0.001) and

interactions with following segment (F(2,26)=32.8, p<0.001) and speaking rate

(F(2,26)=6.39, p<0.01). Speakers on average produced underlying voiced stops with

closure voiced for 79% of their duration and underlying voiceless stops with closure

voiced for 41%.

Interactions revealed that a significant difference in VR between underlying

voiced and underlying voiceless stops was observed in the position before /v/ followed by

a vowel (F(1,13)=169.1, p<0.001) and before /v/ followed by a voiced obstruent

(F(1,13)=28.8, p<0.001). This difference was smaller in the fast condition than in slow or

list conditions. VR was not significantly different in stops before /v/ followed by a

voiceless obstruent (F(1,13)=4.58, p=0.06) across all speaking rate conditions.

The results show that speakers consistently produced a contrast between

underlying voiced and underlying voiceless stops in positions where /v/ is before a vowel

and before a voiced obstruent. The contrast was neutralized and underlying voiced stops

were produced as voiceless before /v/ followed by a voiceless obstruent.

Table E4 (Appendix E) shows the percent of fully voiced underlying voiced and

underlying voiceless stops in each of the three cluster types in the list, slow, and fast rate

conditions. Some underlying voiceless stops were completely assimilated in voicing

when they occurred before /v/ followed by a voiced obstruent. 25% of underlying

voiceless stops in the list condition and 32% of underlying voiceless stops in the slow

condition were produced as completely voiced. This number increased in the fast rate

condition: 62% of underlying voiceless stops were assimilated and produced as fully

voiced.

However, when stops are voiced during the entire closure, it does not necessarily

mean that changes in duration of voicing in a C1 stop in a cluster were caused by a C2

obstruent. Note that in fast speech speakers produced 11% of underlying voiceless stops

in a prevocalic cluster and 7% of underlying voiceless stops in a voiceless cluster as fully

voiced in positions where assimilation cannot be the cause. Assimilation did not occur in

26% of /dvt/ clusters. Thus, it is not impossible for speakers to produce underlying

voiceless C1 stops as fully voiced without assimilation. After adjusting for this

baseline (62% − 7% = 55%), only about half of the underlying /t/s can be considered assimilated before /v/

followed by a voiced obstruent in fast speech.

6.4.4. Duration of a preceding vowel

The final analysis was to determine the effects of underlying voicing and speaking

rate on the duration of a vowel preceding C1 stops in clusters with /v/. If a vowel is

longer before an underlying voiced stop that is voiceless on the surface, this means the

underlying voicing of an obstruent is preserved in the preceding vowel even where the

underlying contrast is lost. A summary of the acoustic measurements is given in Table E5

(Appendix E).

The analysis revealed an effect of underlying voice (F(1,13)=75.7, p<0.001).

Vowels before underlying voiced stops were produced on average with longer duration

than before underlying voiceless stops (Figure 35).

A main effect of following segment was obtained (F(2,26)=9.89, p<0.01). Vowels

were, on average, longer before voiced clusters than before voiceless ones (t(13)=2.88,

p<0.05); the difference between vowel and voiced cluster was not significant (t(13)=1.69,

p=0.115).

Figure 35. Effects of following segment (a) and speaking rate (b) on duration of a vowel preceding C1 stops before /v/.

Speaking rate affected vowel duration (F(2,26)=15.2, p<0.001) and interacted

with voicing (F(2,26)=5.49, p<0.01). As expected, speakers produced longer vowels in

the list and slow conditions, and shorter vowels in the fast condition. The interaction

revealed that vowel duration was significantly affected by rate before underlying voiced

and voiceless stops, but the greatest difference between voiced and voiceless stops was

found in the fast condition (t(13)=3.77, p<0.01).

Thus, the results show that speakers preserve the underlying voicing contrast in

the duration of a preceding vowel not only in prevocalic stops, but also in stops in

obstruent clusters.

6.4.5. Assimilation: Individual results

The results of previous tests showed that voice assimilation was complete in terms

of duration of voicing before /v/ followed by a voiceless obstruent (e.g. na/d vt/ornikom

[tft] ‘over Tuesday’), but voice assimilation before /v/ followed by a voiced obstruent

varied (e.g. o/t vd/ov [dvd]/[tvd] ‘from widows’). In this case it was important to

establish whether all speakers produced both variants, or whether some speakers

assimilated obstruents before /v/ and some did not. The former case would indicate that

voice assimilation before /v/ followed by an obstruent is optional and speakers vary in

production of assimilated tokens. The latter case would indicate that some speakers have

a grammar with obligatory voice assimilation before /v/ followed by an obstruent, but

some speakers have a grammar in which voice assimilation occurs only before a devoiced

/v/ followed by a voiceless obstruent, but no assimilation occurs before a /v/ followed by

a vowel or a voiced obstruent, i.e. before a voiced [v].

Table 17 presents individual results of voice assimilation in the /tvd/ and /dvt/

clusters in the list and fast rate conditions for 14 speakers. All speakers assimilated

(devoiced) stops before /v/ followed by a voiceless obstruent, resulting in production of

[tft] clusters in all speaking rate conditions.

Voicing in /tvd/ clusters was observed less often, but it was a regular pattern for

speakers 3, 6, and 11 even when reading the list. Speakers 8 and 13 produced half of

underlying /t/s in /tvd/ clusters as voiced and half as voiceless. The remaining speakers

did not assimilate /t/s before /v/ followed by a voiced obstruent in the list condition.

Voice assimilation in /tvd/ clusters was found more often in fast speech. Nine

speakers (1, 2, 3, 5, 6, 8, 11, 13, and 14) completely assimilated underlying /t/ before a

voiced cluster and pronounced it as [dvd]. The other speakers did not assimilate

underlying voiceless stops in voiced clusters and pronounced them as [tvd].

These results are consistent with the analysis of voicing ratio: three speakers

account for 25% of fully voiced underlying voiceless stops in voiced clusters in the list

reading, and nine speakers account for 62% of fully voiced underlying voiceless stops in

voiced clusters in the fast condition.
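
The speaker lists given in this section can be cross-tabulated to show which speakers change their pattern with speaking rate. The sketch below uses only the speaker numbers reported in the text above; the group labels and the function name are illustrative assumptions.

```python
# Speaker numbers as reported in the text of this section.
ALL_SPEAKERS = set(range(1, 15))
LIST_REGULAR = {3, 6, 11}   # assimilated /tvd/ even in the list reading
LIST_MIXED = {8, 13}        # half of /tvd/ tokens voiced in the list reading
FAST_ASSIMILATORS = {1, 2, 3, 5, 6, 8, 11, 13, 14}

def group_speakers(all_speakers, list_regular, list_mixed, fast):
    """Partition speakers by their treatment of /tvd/ clusters across rates."""
    return {
        "never": all_speakers - fast - list_regular - list_mixed,
        "always": list_regular & fast,
        "fast_only": fast - list_regular - list_mixed,
        "mixed_in_list": set(list_mixed),
    }

groups = group_speakers(ALL_SPEAKERS, LIST_REGULAR, LIST_MIXED, FAST_ASSIMILATORS)
for name, members in groups.items():
    print(name, sorted(members))
```

Under these labels, "fast_only" picks out the speakers whose pattern depends on rate, and "never" the speakers who keep underlying /t/ voiceless before a voiced [v] at every rate.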

Table 17. Individual results of voice assimilation in clusters with /v/ in the list and fast rate conditions.

Speaker   List              Fast
          /tvd/    /dvt/    /tvd/    /dvt/
SP01      –        +        +        +
SP02      –        +        +        +
SP03      +        +        +        +
SP04      –        +        –        +
SP05      –        +        +        +
SP06      +        +        +        +
SP07      –        +        –        +
SP08      –        +        –        +
SP09      mixed    mixed    +        +
SP10      –        +        –        +
SP11      +        +        +        +
SP12      –        +        –        +
SP13      mixed    +        +        +
SP14      –        +        +        +

Note: The plus sign (+) indicates presence of assimilation in tokens produced by a speaker; the minus sign (–) indicates no assimilation. Speakers who never assimilated stops before a voiced [v] are marked in bold. Speakers who assimilated stops in all rate conditions are marked in italics.

6.6. Discussion and conclusion

The experiment shows that voice assimilation before /v/ is a special case in

modern Russian. The results show that the underlying voicing contrast is indeed

preserved in stops before a prevocalic /v/, as has been claimed previously (Avanesov

1968). The results, however, do not support claims that stops always preserve underlying

voicing in the position before /v/ (Reformatskii 1975). All speakers produced only

voiceless stops before /v/ followed by a voiceless obstruent, suggesting assimilation does

occur in this position. The results do not fully support claims (Hayes 1984, Kn’azev

2006) that voice assimilation is obligatory before /v/ followed by an obstruent, either.

Speakers 4, 7, 9, 10, and 12 never assimilated underlying /t/s in /tvd/ clusters in any

speaking rate condition.

The results partially support the claim that only some speakers assimilate stops

before /v/ followed by an obstruent (Panov 1967). Speakers indeed use one or the other

option (i.e. assimilation or no assimilation) in production, but such variation was found

only in cases with voiced [v]. No variation was found in cases with devoiced /v/ (i.e.

produced as [f] before voiceless obstruents).

The results show that speakers’ use of the assimilation pattern depends on

speaking rate: the same speaker can produce different patterns at different rates.

Several participants in this study (speakers 1, 2, 5, and 14) assimilated stops before /v/ in

voiced clusters in fast speech, but not in slow speech.

The results of this experiment suggest that there are two groups of speakers with

different grammars. The first group is the speakers who assimilate stops before /v/

followed by an obstruent. In this grammar, /v/ is transparent to voice assimilation, and

claims about transparency of Russian /v/ (e.g. Jakobson 1968, Hayes 1984) hold true. The

second group does not assimilate stops before /v/ followed by a voiced obstruent but does

so before /v/ followed by a voiceless obstruent. For these speakers, /v/ is indeed

idiosyncratic and schizophrenic. These speakers preserve underlying voicing in stops

before a voiced /v/ no matter what the following segment is. In this case, /v/ behaves like

a sonorant consonant. In cases when /v/ occurs before a voiceless obstruent, it assimilates

and becomes voiceless. In these clusters underlying /v/ is realized phonetically as a

voiceless labio-dental fricative [f]; it behaves like an obstruent and triggers voice

assimilation (devoicing) of a preceding stop.

Finally, the results show that even in cases where a stop assimilated in voicing

before /v/, assimilation never resulted in complete neutralization. The underlying voicing

was recorded on the vowel: vowels were longer before underlying voiced stops. This

pattern was found in all obstruent clusters, as shown in Chapters III-V. In this sense

clusters with /v/ are like all other obstruent clusters.

CHAPTER VII

EXPERIMENT 5: VOICE ASSIMILATION AND FINAL DEVOICING

AT PHRASE LEVEL

7.1. Background

As was shown definitively in Chapter 3, voice assimilation across a morpheme

boundary results in complete assimilation of C1 to the voicing of C2. In addition, final

devoicing of word-final stops is complete. However, there is no complete neutralization

of the underlying voicing contrast of C1 in morpheme-internal clusters or of word-final

stops. The contrast is manifested in the duration of the preceding vowel. Voice

assimilation in Russian is reported to occur across a word boundary as well, although

some claim that it occurs inconsistently in Russian. Transcriptions of cases with obstruent

clusters across a word boundary in Avanesov (1968) show both devoiced and voiced

obstruents before the voiced word-initial obstruent which is supposed to trigger voice

assimilation. Examples with no assimilation include [stix daidjot] ‘the verse will reach’,

[rot gatovi] ‘kin ready’, [kak zima] ‘as winter’, [ʃak galanskij] ‘pace of Holland’, and

[vjek daʒivatj] ‘live out one’s days’. Examples with assimilation include [kag daxodit] ‘as

it reaches’ and [drug gdrugu] ‘to one another’.

Very few studies address the question of what determines the results of voice

assimilation across a word boundary in Russian. As noted in Chapter 2, among the factors

that affect voice assimilation in this position, researchers mention stress (Baranovskaja

1968, Shapiro 1993), speech tempo (Kn’azev 2004), semantics (Baranovskaja 1968), and

individual variation (Paufošima and Agaronov 1971). Assimilation is claimed to be more

likely to occur when the words in the phrase have one primary stress, constitute an idiom,

or are pronounced in close contact to each other in fast speech.

Studies of Polish, another true voice language, which has, like Russian, regressive

voice assimilation and final devoicing, report that word-final stops in clusters before a

word-initial voiceless stop show traces of incomplete neutralization in a sentence

(Slowiaczek and Dinnsen 1985). Slowiaczek and Dinnsen also report that devoicing in

word-final stops does not result in complete neutralization.

This experiment was designed to examine acoustic parameters of voicing at a

phrase level. Given the results of the previous experiments in this study, which showed

incomplete neutralization in voice assimilation and incomplete neutralization of the

underlying voicing contrast in word final stops, it was important to establish whether

neutralization occurs across a word boundary.

7.2. Participants

Eight native speakers of Russian, four males and four females, participated. They

were monolingual speakers who had grown up and resided in Tambov (central Russia).

Their ages ranged from 21 to 48 (mean age=37.3). They all were speakers of educated

Standard Russian.20 They had no history of speech or hearing disorders. The subjects

were paid a standard hourly rate for their participation.

7.3. Stimuli

The list of stimuli included three minimal pairs with final voiced and voiceless

stops at three places of articulation (e.g. lug ‘meadow’ – luk ‘onion’). Short monosyllabic

CVC words were used in this experiment to determine whether incomplete devoicing

would occur more often than in Experiment 1, in which longer, non-minimal words were

used.21 The target words occurred before a word beginning with a vowel, i.e. where final

devoicing is expected (__# V), and before a word beginning with a voiceless (__# [vl])

or voiced obstruent (__# [vd]), i.e., in the environments where voice assimilation is

possible. Thus, there were six stop-vowel combinations (/po/, /to/, /ko/, /bo/, /do/, /go/);

six stop-stop combinations in which the second stop was voiceless (VL) (/pt/, /tp/, /kp/,

/bt/, /dp/, /gp/), and six stop-stop combinations in which the second stop was voiced

(VD) (/pd/, /tb/, /kb/, /bd/, /db/, /gd/). Therefore, there were six types of clusters: /VL #

VD/, /VD # VD/, /VL # VL/, /VD # VL/, /VL # Vowel/, and /VD # Vowel/. Heterorganic

stops were used to reduce the number of unreleased stops in a cluster. The adjacent words

matched semantically. The list of adjacent words is given in Appendix A(5). In addition

to the 18 phrases, 22 filler phrases were used, which were associated semantically with

the tested collocations, but had various obstruent-sonorant clusters across a word

boundary.

20 The screening procedure included a short interview to find out whether the participants produce a voiced velar obstruent as a stop [g] (Standard Russian) or as a fricative [ɣ] (Southern Russian dialects). All of the speakers who participated in the experiment produced a voiced velar obstruent as a stop.

21 A recent study (Matsui 2011) has shown that Russian speakers can recover underlying voicing of devoiced final stops in CVC words.

The stimuli were produced in three speaking rate conditions. In the list condition,

the target phrases were pronounced as a list, with a pause between the two words (e.g. luk

(pause) očistili ‘onion <pause> was peeled’). In the slow condition, the target phrases

were pronounced at a comfortable tempo in a carrier phrase Skaži ____ ješče raz (‘Say

____ once again’). The speakers did not have any particular instructions about pauses

between words. In the fast condition, the target phrases were pronounced in the same

carrier phrase quickly. The speakers were asked to repeat the phrase if they paused after a

target word. For each condition, the speakers read the list twice. The readings were

randomized, so the speakers never had two consecutive readings of the same condition.

108 target phrases (18 phrases x 3 conditions x 2 readings) for each speaker were

recorded. 24 tokens from Speaker 2 (the first reading in the fast condition) were

discarded because they were not controlled for pauses. In addition, 25 word-final tokens

in the slow and fast speaking rate conditions were discarded due to the absence of audible

and visible (on a spectrogram) release. These tokens were evenly distributed among all

speakers; the only exception was Speaker 4, who pronounced 12 unreleased stops.

Therefore, a total of 815 target stops were selected for the analysis.
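
The cluster types and token totals described in this section can be checked with a short sketch. The combinations and counts are those given in the text; the dictionary layout, the helper function, and the labels are illustrative assumptions.

```python
from collections import Counter

# Stop + following-segment combinations listed in Section 7.3,
# keyed by the type of the word-initial segment.
combinations = {
    "Vowel": ["po", "to", "ko", "bo", "do", "go"],
    "VL": ["pt", "tp", "kp", "bt", "dp", "gp"],
    "VD": ["pd", "tb", "kb", "bd", "db", "gd"],
}
VOICED_STOPS = set("bdg")

def cluster_type(combo, context):
    """Label a combination by the underlying voicing of C1 and its context."""
    c1 = "VD" if combo[0] in VOICED_STOPS else "VL"
    return f"{c1} # {context}"

types = Counter(
    cluster_type(combo, context)
    for context, combos in combinations.items()
    for combo in combos
)

phrases = sum(types.values())      # target phrases per reading
per_speaker = phrases * 3 * 2      # 3 rate conditions x 2 readings
recorded = per_speaker * 8         # 8 speakers
analyzed = recorded - 24 - 25      # pause-related and unreleased tokens

print(dict(types))
print(per_speaker, recorded, analyzed)
```

Each of the six cluster types is represented by three combinations, one per place of articulation of the word-final stop.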

7.4. Measurements

The speakers were digitally recorded in a quiet room using a one-point condenser

SONY ECM-MS907 microphone and an Echo Indigo IO soundcard at 22,050 Hz. The

digitized segments were manually marked for boundaries in PRAAT (Boersma

& Weenink 2011). Both the waveform and the spectrogram were used to set the stop

boundaries. Following Jessen (1998), the beginning of the stop closure was marked at the

end of the second formant structure, which typically coincides with a significant drop in

amplitude of vocal fold vibration. The end of the closure was marked at the beginning of

the release burst. Figure 36 illustrates a waveform of an underlying voiceless word-final

stop before a word beginning with a vowel, where no phonological voicing is expected.

Figure 36. C1 stop closure, voiceless, before a vowel (from a token luk očistili ‘onion was peeled’, spoken by S6 (m), fast rate)

Figure 37 and Figure 38 exemplify cases of juxtaposition of a stop across a word

boundary before voiceless (Figure 37) and voiced (Figure 38) stops. In cases where the

first stop in a cluster had a weak release, the difference in amplitude between two voiced

stops was used to determine the boundary, where possible.

Figure 37. C1 stop closure, voiceless; C2 stop, voiceless (from a token kod podobran ‘the code is found’, spoken by S6 (m), fast rate)

Figure 38. C1 stop closure, voiced, C2 stop, voiced (from a token lug dokošen ‘the lawn is mown’, spoken by S1 (f), fast rate)

To investigate whether voice assimilation occurs across word boundaries, acoustic

measurements of the voiced and voiceless stops were performed. Closure duration and

duration of voicing of the target stops were measured, using the criteria described in

Chapters 3-6. Voicing ratios were then calculated as a ratio of duration of voicing to

closure duration. Following Slis (1986), phonetic voicing of C1 stops was established

using a cut-off point at the “mean + 2 standard deviations” value for voicing ratio in

underlying voiceless stops in the environment where no voicing is expected, i.e. before

vowels and voiceless C2 stops. Therefore, stops with a voicing ratio equal to or lower

than 52% were considered voiceless. Stops with a voicing ratio higher than this

value were considered voiced.
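
The voicing ratio and the 52% criterion can be expressed as a minimal sketch. The function names and the example durations are invented for illustration; only the definition of the ratio and the threshold value come from the text.

```python
def voicing_ratio(voicing_ms, closure_ms):
    """Percentage of the stop closure that is voiced."""
    if closure_ms <= 0:
        raise ValueError("closure duration must be positive")
    return 100.0 * voicing_ms / closure_ms

def classify(ratio, threshold=52.0):
    """Voiced if the ratio exceeds the threshold; otherwise voiceless."""
    return "voiced" if ratio > threshold else "voiceless"

# Invented (voicing, closure) durations in ms, for illustration only.
tokens = [(12, 80), (30, 60), (45, 75), (70, 70)]
labels = [classify(voicing_ratio(v, c)) for v, c in tokens]
print(labels)
```

A token whose ratio is exactly 52% is labelled voiceless, in line with the "equal to or lower" wording of the criterion.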

7.5. Results

The goal of the analysis was to determine whether voice assimilation takes place

in stop-stop clusters across a word boundary at different speaking rates. If the first stop in

the cluster (C1) is assimilated in voicing, the voicing properties of this segment should be

consistent with the voicing properties of the second stop in the cluster (C2). Before a

word-initial vowel, absence of voicing during closure in C1 stops was interpreted as final

devoicing.

The analysis involved several stages. First, the effect of the speech tempo on

production was investigated to determine whether the speaking rate manipulation had the

intended effect. Next, the voicing properties of C2 were analyzed to establish whether the

segments that determine the results of voice assimilation remain stable. Finally, two

separate analyses of C1 followed. The goal of the first analysis was to assess the degree

of assimilation in C1 in different speaking rate conditions. The goal of the second

analysis was to determine the acoustic properties of final C1 stops.

7.5.1. Segment length and speaking rate

Prior to the analysis of acoustic measurements of C1 and C2 consonants, it was

necessary to establish whether the speakers’ productions changed across speaking rates.

Figure 39 shows the mean duration of target words in the list, slow, and fast speaking rate

conditions. A repeated measures ANOVA with speaking rate (list, slow, fast) as a factor

was performed to assess changes in word duration as a function of speaking rate. The

effect of the rate manipulation was significant (F(2,14)=76.95, p<0.001). The means in

the three rate conditions were significantly different from each other (t(7)=8.221, p<0.001;

t(7)=7.025, p<0.001). As expected, target words were longer in slower

speech and shorter in faster speech.
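
To make the degrees of freedom in F(2,14) transparent, the following sketch computes a one-way repeated measures ANOVA by hand for eight speakers and three rate conditions. The function name and the word durations are hypothetical and are not the measurements reported here. No sphericity correction is applied, and the sketch is not meant to reproduce the statistical software used in the study.

```python
def rm_anova_oneway(data):
    """One-way repeated measures ANOVA.

    data: list of per-subject lists, one value per condition.
    Returns (F, df_effect, df_error).
    """
    n = len(data)       # number of subjects
    k = len(data[0])    # number of conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    # Error term: what remains after removing condition and subject effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_effect = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_cond / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error


# Hypothetical mean word durations (ms); rows are 8 speakers,
# columns are the list, slow, and fast conditions.
durations = [
    [520, 410, 300],
    [560, 430, 310],
    [490, 400, 295],
    [600, 450, 330],
    [540, 405, 290],
    [510, 420, 320],
    [580, 440, 305],
    [530, 415, 315],
]

f_value, df_effect, df_error = rm_anova_oneway(durations)
print(f"F({df_effect},{df_error}) = {f_value:.2f}")
```

With three conditions and eight speakers, the effect has 3 - 1 = 2 degrees of freedom and the error term has (3 - 1)(8 - 1) = 14, which matches the form of the test statistic reported above.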

Figure 39. Mean segment duration in three speaking rate conditions

7.5.2. C2 stops

Voicing of C2 stops was examined using a repeated measures ANOVA with

speaking rate (list, slow, fast) and voice (voiceless, voiced) as factors. The results are

summarized in Figure 40.

As expected, C2 stops retained their underlying specification for voice. A

significant main effect of voice was obtained (F(1,7)=1603.98, p<0.001). Voiced stops

were fully voiced in the list (VR=98%), slow (VR=95%), and fast (VR=93%) rate

conditions; voiceless stops were voiceless (VR=1%) in all speaking rate conditions. An

effect of speaking rate was not obtained (F(2,14)=1.94, p>0.05). The proportion of

voicing during closure was relatively stable across speaking rate conditions, which is

consistent with the results for word-internal clusters (see Ch.3) and clusters across a clitic

boundary (see Ch.4-6).

[Figure 39 axes: duration (ms) on the vertical axis; List, Slow, and Fast rate conditions on the horizontal axis.]

Figure 40. Mean VR of C2 stops in three speaking rate conditions.

7.5.3. Effects of speaking rate and following segment on

C1 voicing

Figure 41 shows that evidence for two phonological processes in C1 stops –

Voice Assimilation and Final Devoicing – was found in the data. When the following

segment was a voiced consonant, the underlying voicing contrast in C1 stops was

neutralized and duration of voicing averaged 38 ms. When the following segment was a

voiceless consonant, the underlying voicing contrast in C1 stops was neutralized;

duration of voicing was significantly shorter and averaged 16 ms. Before a vowel,

underlying voiced and voiceless final stops were produced with a short voicing tail

averaging 15 ms.

To assess the effects of these conditions, a repeated measures ANOVA with

speaking rate (list, slow, fast), underlying voice (voiced, voiceless), and following

segment (vowel, voiceless, and voiced) as factors was performed on voicing duration.

Gender (female, male) was added to the model as a between-subject factor. No effect of

underlying voice was found (F<1); the effect of following segment, in contrast, was large

(F(2,12)=179.93, p<0.001). The difference between voiced and voiceless stops across the

following segments was 23 ms. Thus, clearly the dominant influence on voicing in word-final C1

at a phrase level is the following segment and the two phonological processes

assumed to be relevant in the different environments are voice assimilation and word-final

devoicing.

[Figure 40 data labels, voicing ratio (%): voiceless C2 stops, 0.0% (List), 0.9% (Slow), 0.0% (Fast); voiced C2 stops, 98.2% (List), 95.5% (Slow), 93.6% (Fast).]

Figure 41. The effect of following segment on duration of voicing in word-final C1 stops pooled across all speakers and speaking rates.

There was also a significant main effect of speaking rate (F(2,12)=22.63,

p<0.001), and it interacted with following segment (F(4,24)=33.0, p<0.001). No voicing

of final stops was found in the list condition or in any speaking rate before a vowel.

These interactions are explored in more depth in the subsequent analyses. No effect of

Gender was obtained (F<1). Men and women did not differ reliably in this experiment.

These results support the claim that underlying laryngeal specifications

in stops in Russian are changed as a function of the following segment. To further

investigate the effects of speaking rate on acoustic properties of stops in cases of voice

assimilation and final devoicing, separate analyses were performed for each process.

7.5.4. Voice assimilation before C2 stops

To assess conditions of voice assimilation, a repeated measures ANOVA was

performed with speaking rate (list, slow, fast), underlying voicing (voiceless, voiced), and

C2 voicing (voiceless, voiced) as factors. Significant main effects of speaking rate

(F(2,14)=33.49, p<0.001) and C2 voicing (F(1,7)=376.8, p<0.001), as well as interaction

(F(2,14)=32.79, p<0.001) were obtained, but no effect of underlying voicing (F<1) was

found. Figure 42 shows that voice assimilation occurs before voiced and voiceless C2

stops in slow and fast speech. In the list condition, C1 stops were generally pronounced

as voiceless before an obligatory pause, with voicing during 18% of stop closure.

Figure 42. The effects of C2 voice and speaking rate on voicing in word-final C1 stops pooled across all speakers.

Voice assimilation in stop-stop clusters across a word boundary in Russian is

determined by the second stop in a cluster. When assimilation occurs, the first stop takes

on the laryngeal specification of the second stop. The result that there is no effect of

underlying voice strongly suggests that C1 stops are pronounced as voiceless before

voiceless C2 stops (VR averaged 24.7%) and as voiced before voiced C2 stops (VR

averaged 94.5% in fast speech and 73.3% in slow speech).

However, before reaching this conclusion, there are two questions that must be

addressed. First, is there a gradient modification of voicing during closure or is this due

to a distribution of pauses? The difference between VR of underlying voiced word-final

stops in the slow and fast rate conditions allows for two different interpretations. The

“categorical” interpretation suggests that speakers may fail to assimilate some C1 stops in

slow speech if they pause between C1 and C2 and, therefore, produce such final C1s as

voiceless. The alternative interpretation would explain this difference as a gradient

decrease of voicing during closure in slow speech.

Figure 43 presents distributions of voicing durations in C1 stops in the

environments where assimilation was found. The distribution of voicing in tokens before

voiced C2 shows that voice assimilation occurs in 97% of clusters in fast speech (Figure

43b) when there is no pause between words. Without control for pauses in slow speech,

speakers produce only 83% of clusters with voice assimilation and 17% of tokens before

voiced C2s were produced as voiceless (Figure 43a). Voice assimilation was categorical,

but some C1 tokens were not assimilated in this environment.
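
The two interpretations make different predictions about the shape of the distribution: a categorical process should produce two separate clusters of tokens (short voicing tails and fully voiced closures), whereas a gradient process should produce a single cluster that shifts with rate. The sketch below shows one way to tabulate such a distribution in 10 ms bins and to compute the share of tokens above the 52% cut-off. The token values and function names are hypothetical and serve only to illustrate the tabulation.

```python
from collections import Counter

BIN_MS = 10        # bin width, as on the horizontal axis of a duration histogram
VR_CUTOFF = 0.52   # cut-off for phonetic voicing given in the text


def bin_durations(durations_ms, bin_ms=BIN_MS):
    """Count tokens per voicing-duration bin; bins are labeled by lower edge."""
    return Counter(int(d // bin_ms) * bin_ms for d in durations_ms)


def share_assimilated(voicing_ms, closure_ms, cutoff=VR_CUTOFF):
    """Share of tokens whose voicing ratio exceeds the cut-off."""
    voiced = sum(1 for v, c in zip(voicing_ms, closure_ms) if v / c > cutoff)
    return voiced / len(voicing_ms)


# Hypothetical C1 tokens before a voiced C2 (voicing and closure in ms).
voicing = [12, 15, 58, 62, 65, 70, 18, 61, 66, 72]
closure = [70, 75, 60, 64, 66, 72, 68, 63, 67, 74]

histogram = bin_durations(voicing)
for lower_edge in sorted(histogram):
    print(f"{lower_edge:3d}-{lower_edge + BIN_MS - 1:3d} ms: {histogram[lower_edge]}")
print(f"assimilated: {share_assimilated(voicing, closure):.0%}")
```

In this invented sample the counts fall into two groups with an empty range between them, which is the pattern the categorical interpretation predicts.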

Figure 43. Distribution of voicing durations in word-final C1 stops in (a) slow and (b) fast speech, pooled across eight speakers.

The other question arises because of the results reported in Chapter 3, where it is

shown that the voicing contrast in stops is not completely neutralized in assimilation and

word-final contexts. Traces of underlying voicing were found in the duration of a

preceding vowel. Other studies of voicing in word-final stops at the phrase level (e.g.

Slowiaczek and Dinnsen 1985) also report similar traces in closure duration and/or in

duration of voicing into closure. Thus, the following cues associated with voicing – stop

closure duration, voicing duration, and duration of a preceding vowel – were examined

separately.

7.5.4.1. Closure duration

Duration of stop closure of assimilated C1 tokens was analyzed using a repeated

measures ANOVA with speaking rate (slow, fast), underlying voicing (voiced, voiceless),

and C2 voicing (voiced, voiceless) as factors. A significant main effect of speaking rate

(F(1,7)=54.17, p<0.001) was obtained. As expected, speakers pronounced shorter

consonants in fast speech (Fast: M=53 ms, SD=14; Slow: M=65 ms, SD=17).

The effect of C2 voicing was significant (F(1,7)=5.43, p=0.05). The duration of

stop closure was shorter in voiced clusters (M=57 ms, SD=16) than in voiceless clusters

(M=74 ms, SD=26).

Figure 44. Effects of C2 voicing and speaking rate on closure duration (a) and voicing (b) in word-final C1 stops.

[Figure 44 panels: (a) Closure, duration (ms) by C2 voicing (C2 Voiced, C2 Voiceless); (b) Voicing, duration (ms) by C2 voicing (Voiced, Voiceless) and rate (Slow, Fast); legend: underlying voice (Voiceless, Voiced).]

A main effect of underlying voicing (F(1,7)=2.86, p=0.135) and interactions

(F<1) were not obtained. The duration of stop closure was slightly shorter in underlying

voiced stops (M=57 ms, SD=23) than in underlying voiceless stops (M=70 ms, SD=26),

but the difference was not significant, as shown in Figure 44a.

7.5.4.2. Voicing duration

The analysis of voicing duration in assimilated C1 stops (repeated measures

ANOVA) did not reveal a significant effect of underlying voicing (F<1) or interaction

(F<1). The underlying contrast in voicing was completely neutralized in stop clusters

across a word boundary (see Figure 44b).

Significant main effects of speaking rate (F(1,7)=37.52, p<0.001) and C2 voicing

were obtained (F(1,7)=348.9, p<0.001), and there was an interaction (F(1,7)=28.48,

p<0.001). Stops had significantly longer voicing during closure before voiced C2s (M=56

ms, SD=16) than before voiceless C2s (M=16 ms, SD=11). The amount of voicing changed

across speaking rates; however, voicing duration does not necessarily increase in every

segment. Rather, the relationship between rate and voicing duration is different in

phonetically voiced and voiceless stops. No effect of rate was found in voiceless stops

(F<1). Speakers produced a short voicing tail averaging 16 ms in slow and fast

conditions. The duration of voicing in voiced stops, in contrast, was significantly longer

in slow speech (M=62 ms, SD=17) than in fast speech (M=49 ms, SD=12; F(1,7)=40.98,

p<0.001).

7.5.4.3. Duration of a preceding vowel

The same statistical test, applied to duration of a preceding vowel, yielded no

effect of underlying voice (F<1), no effect of C2 voicing (F(1,7)=2.90, p=0.132),

and no interaction (F(1,7)=3.49, p=0.107). Vowels were slightly longer before underlying

voiced stops but the difference (2 ms) was not significant.

Only speaking rate affected vowel duration (F(1,7)=9.22, p<0.05). Vowels were

longer in slow speech (M=107 ms, SD=33) than in fast speech (M=83 ms, SD=26).

7.5.4.4. Interim conclusion

The results suggest that voice assimilation in stop-stop clusters across a word

boundary in connected speech in Russian results in fairly complete neutralization of the

voicing contrast. Speakers showed a strong tendency to assimilate word-final stops in

voicing to the following stop. This process is fairly complete in fast speech, but 17% of

word-final stops in slow speech were not assimilated. All word-final stops were devoiced

before a pause in the list condition. The next section presents the results of the analysis of

final devoicing at phrase level.

7.5.5. Final devoicing

Final devoicing occurred in two cases: 1) speakers produced devoiced stops

before a word beginning with a vowel in all speaking rate conditions, and 2) speakers

devoiced stops in all environments before a pause in the list reading. Two different

analyses were performed for each case to examine acoustic parameters of final devoicing.

7.5.5.1. Final devoicing in stops before a vowel

To examine acoustic properties of final stops at phrase level before a word

beginning with a vowel, separate repeated measures ANOVAs with underlying voice

(voiceless, voiced) and speaking rate (list, slow, fast) as factors were performed for each

cue.

For the duration of stop closure, a significant main effect of speaking rate was

found (F(2,14) =31.97, p<0.001): speakers produced longer stops in the list condition. No

main effect of underlying voice was obtained (F(1,7)=3.46, p=0.105). The interaction

with rate (F(1,7)=6.63, p<0.01) revealed that stop closure was shorter in underlying

voiced stops in the list condition (Vd: M=97 ms, SD=25; Vl: M=103 ms, SD=25;

F(1,7)=4.26, p=0.05) and in the slow condition (Vd: M=77 ms, SD=18; Vl: M=82 ms,

SD=18; F(1,7)=6.96, p<0.05), but this difference was neutralized in fast speech (M=61

ms, SD=13; F<1).

When the same analysis was applied to voicing duration, it did not yield main

effects of underlying voice, speaking rate, or interaction (F<1). As shown in Figure 45,

speakers tended to produce a roughly stable amount of voicing (M=15 ms, SD=9) in

devoiced word-final stops followed by a vowel in all speaking rates, following the pattern

observed in all voiceless stops.

Figure 45. Differences in duration of voicing in devoiced final stops before a word beginning with a vowel.

The analysis of vowel duration (see Figure 46) revealed an effect of underlying

voicing (F(1,7)=17.05, p<0.01). The underlying contrast was not neutralized and vowels

were longer before underlying voiced stops (Vd: M=110 ms, SD=42; Vl: M=104 ms,

SD=41). As expected, rate affected vowel duration (F(1,7)=12.72, p<0.01). Duration of a

vowel was significantly different in the three conditions (List: M=138 ms, SD=40; Slow:

M=102 ms, SD=32; Fast: M=77 ms, SD=24).

[Figure 45 axes: duration (ms) on the vertical axis; List, Slow, and Fast rate conditions on the horizontal axis; legend: Voiceless, Voiced.]

Figure 46. Effect of underlying voicing on the duration of a preceding vowel before final stops before a word beginning with a vowel.

7.5.5.2. Final devoicing in the list condition

To assess the effect of following segment on closure duration and voicing

duration of devoiced final stops in the list condition, a two-way repeated measures ANOVA

was performed on each cue with underlying voice (voiced, voiceless), and following

segment (voiceless, voiced, vowel) as factors. For closure duration, the test revealed a

marginal main effect of underlying voice (F(1,7)=4.78, p=0.060). Stop closure was

slightly longer in underlying voiceless stops (M=102 ms, SD=30) than in voiced stops

(M=97 ms, SD=24). Following segment affected closure duration (F(2,14)=6.42,

p<0.01). Stops were shorter when they occurred before a voiceless C2 (M=96 ms,

SD=26) and longer before a voiced C2 (M=102 ms, SD=30) and a vowel (M=100 ms,

SD=25); the difference between the latter two was not significant (t(7)=1.18, p=0.278).
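
The pairwise comparisons reported with t(7) are consistent with paired tests over the eight speakers (n - 1 = 7 degrees of freedom). As a hedged illustration of that kind of comparison, the sketch below computes a paired t statistic from per-speaker means. The function name and the closure durations are hypothetical, and the sketch does not claim to reproduce the exact test used in the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two matched samples."""
    if len(x) != len(y):
        raise ValueError("samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / sqrt(n)), n - 1


# Hypothetical per-speaker mean closure durations (ms) for 8 speakers.
before_voiced_c2 = [104, 98, 110, 95, 101, 107, 99, 102]
before_vowel = [101, 99, 106, 96, 98, 105, 100, 99]

t_value, df = paired_t(before_voiced_c2, before_vowel)
print(f"t({df}) = {t_value:.2f}")
```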

For voicing duration, a significant main effect of underlying voice was also found

(F(1,7)=7.81, p<0.05), but no effect of following segment (F<1) or

interaction (F<1) was obtained. Mean voicing into closure was longer in underlying voiced stops

(M=17 ms, SD=10) than in underlying voiceless stops (M=14 ms, SD=8) in all

environments, as shown in Figure 47.

[Figure 46 axes: duration (ms) on the vertical axis; List, Slow, and Fast rate conditions on the horizontal axis; legend: Voiceless, Voiced; three asterisks appear in the plot area.]

Figure 47. Effects of following segment and underlying voice on voicing duration in devoiced final stops in the list condition.

For duration of a preceding vowel, a main effect of underlying voice was obtained

(F(1,7)=5.59, p<0.05). Vowel duration was longer before underlying voiced stops

(M=137 ms, SD=38) than before underlying voiceless stops (M=131 ms, SD=38), as

shown in Figure 48.

Figure 48. Effects of following segment and underlying voice on duration of a preceding vowel before devoiced final stops in the list condition.

The effect of following segment was significant (F(2,14)=5.47, p<0.05). A

preceding vowel was longer before final stops followed by a vowel (M=139 ms, SD=40;

t(7)=2.75, p<0.05); vowel duration before stop clusters was slightly longer when a vowel

was followed by a voiced stop, but the difference did not reach significance level (Vd:

M=133 ms, SD=38; Vl: M=131 ms, SD=38; t(7)=1.33, p=0.226).

7.5.5.3. Interim conclusion

The results show that speakers do not completely neutralize the underlying

voicing contrast in terms of some cues in both devoicing conditions: before a word

beginning with a vowel at all speaking rates and before a pause in the list reading. They

produced underlying voiced stops with shorter closure and longer voicing during closure.

The underlying difference was not completely neutralized in duration of a preceding

vowel, either, which was overall longer before underlying voiced stops. In this

experiment, like in the experiment reported in Chapter 3, there was no complete

neutralization of the underlying voicing contrast. However, more traces of underlying

voicing were observed.

7.6. Discussion and conclusion

The goal of this experiment was to examine voicing properties of word-final stops

at the phrase level. The results provide evidence for two processes: Final devoicing and

Voice assimilation. Each process is controlled by a segment that follows the final stop.

Devoicing is found in stops followed by a vowel or a pause; voice assimilation is found

before voiced and voiceless obstruents. The results support the view that voice

assimilation overrides final devoicing when a final stop occurs in a cluster before a

voiced stop. However, the results of this experiment provide new information about

voicing in Russian word-final stops.

1. Final devoicing. Recall that the results for devoiced word-final stops, discussed

in Chapter 3, showed that the underlying voicing contrast is not completely neutralized:

duration of a preceding vowel and F1 frequency are different in underlying voiced and

voiceless stops. Neutralization was found, in contrast, in closure duration and duration of

voicing. In the experiment reported in this chapter, which had different speakers and

target words, incomplete neutralization was also found in closure duration and duration

of voicing in the list condition.

Speaking rate was an important factor that affected the voicing contrast in final

stops. Faster speech facilitated the neutralization. In the fast speech condition, significant

underlying differences were found only for the duration of a preceding vowel; differences in

stop closure and voicing into closure were neutralized in fast speech. In the list condition,

the contrast was found in all cues. Speakers did not completely neutralize final stops in

CVC words in closure duration, duration of voicing into closure, and in duration of a

vowel.

2. Voice assimilation. Final stops in clusters across a word boundary showed

evidence of voice assimilation. A strong effect of C2 on voicing in C1 stops was found.

Following the pattern observed in word-internal assimilated stops, the underlying voicing

contrast was neutralized in terms of duration of voicing and closure duration in all

clusters across a word boundary. Phonetic voicing in voiced assimilated stops was

affected by speaking rate, which is consistent with the pattern found earlier for voiced

stops in this study.

However, some parameters of voicing in clusters across a word boundary differed

from word-internal clusters. The effect of C2 voicing on some cues was not as strong in

clusters across a word boundary as in word-internal clusters. Specifically, voicing during

closure in C1 was affected by C2 voicing, which means that clusters were either

completely voiceless or completely voiced as a function of the C2. However, closure

duration of word-final stops in clusters and duration of a preceding vowel were not

affected by C2 voicing in this experiment.

Variation in results for different cues in final stops is, nevertheless, consistent

with the pattern for word-final stops discussed in Slowiaczek and Dinnsen (1985). They

point out that different speakers left traces of the voicing contrast in different cues. The

same tendency was found in this study. One pool of speakers (Experiment 1, Chapter 3)

tended to neutralize closure duration and voicing duration of devoiced final stops but

retained the voicing contrast on a preceding vowel and F1. The other pool of speakers

(Experiment 5, this chapter) did not neutralize devoiced final stops in closure duration,

duration of voicing, and duration of vowel in the list condition, but they neutralized the

underlying contrast in terms of closure duration and duration of voicing in fast speech.

CHAPTER VIII

SUMMARY OF RESULTS AND IMPLICATIONS

The goals of this study were to investigate the acoustic cues to the voicing

contrast in Russian stops and to examine the implementation of voicing with particular

reference to voicing during closure in intervocalic stops and in stop clusters. Closure

duration, duration of a preceding vowel, f0, and F1 were also investigated to determine

the acoustic correlates of the voicing contrast in Russian. Each cue was tested to

determine if there was a significant difference between voiced and voiceless categories,

and then the cues were tested for their ability to predict a voiced or a voiceless member of

a category.

In addition, duration of voicing in stops in intervocalic position was examined for

effects of speaking rate to determine whether voicing during closure in voiced stops

changes as a function of speaking rate in a fashion similar to the VOT of initial stops in

other true voice languages. The effect of rate on voicing was interpreted as evidence for

phonological specification with the feature [voice].

One special focus in this study was on voicing in stops in prepositions. Some

claims in the literature have suggested that voicing processes in prepositions are not

regular, that is, there is incomplete voice assimilation in obstruent clusters or there is

devoicing of voiced obstruents in prepositions. It is also claimed that there is a type of

assimilation in stops in prepositions that is extremely rare across languages and

typologically implausible, that is, voice assimilation to a following obstruent through an

intervening sonorant consonant. This claim, however, has been questioned. To determine

whether voice assimilation is present or absent in stops, cues for voicing in C1 stops in

prepositions were examined for effects of C2 obstruent. As voice assimilation is triggered

by the rightmost (C2) obstruent in a cluster, only the effect of the C2 obstruent can

unambiguously indicate voice assimilation in C1 stops.

The following are the main findings of the experiments in this study:

(1) an effect of speaking rate was found for voicing during closure in

voiced stops in intervocalic position; no effect of rate was found for

voicing or VOT in voiceless stops;

(2) no complete neutralization of the underlying contrast was found for

devoiced final stops;

(3) no complete neutralization of the underlying contrast was found for

assimilated stops in morpheme-internal clusters;

(4) no complete neutralization of the underlying contrast was found for

assimilated stops in clusters across a word boundary;

(5) no voice assimilation (i.e. the effect of C2 obstruent) was found in

obstruent-sonorant-obstruent clusters; rather, greater variation in

voicing was found in stops in prepositions, compared with word-

internal stops.

The implications of these findings for phonological theory are discussed in the

next sections.

8.1. Effect of speaking rate on voicing

In line with studies of rate effects on VOT in initial stops (Kessinger and

Blumstein 1997; Magloire and Green 1999; Solé and Estebas 2000; Beckman et al 2011),

an effect of speaking rate was found on voicing during closure in intervocalic stops. The

results of the study show that voicing during closure changed as a function of speaking

rate only in voiced stops. No effect of rate on voicing into closure or on VOT was found in

voiceless stops.

Studies of effects of rate on VOT have shown that speakers produce shorter

prevoicing (in languages like French or Thai) or aspiration (in English or Thai) in initial

stops in faster speech. The durations of prevoicing or aspiration increase in slower

speech. It is noteworthy that no change is found in the short lag VOTs of voiceless

unaspirated stops. Short lag VOT in different languages (e.g. in Spanish, English, French,

or Thai) remains relatively stable in different rate conditions. Beckman et al (2011) argue

that VOT changes as a function of speaking rate in stops specified with phonological

features. Prevoicing in stops specified with [voice] and aspiration in stops specified with

[spread glottis] exhibit similar lengthening in slower speech. No change across speaking

rates in voiceless unaspirated stops is a property of stops assumed to be unspecified for

laryngeal features (see also Honeybone 2005 for diachronic arguments for unspecified

stops).

The results of this study support the claim that voicing changes as a function of

speaking rate only in phonologically specified stops. The theory suggested by Beckman

et al (2011) explains the asymmetric changes in VOT of initial stops. The results of this

study suggest that the model can be extended to include changes in voicing during

closure in voiced stops in other positions within a word. Indeed, the feature [voice] is

specified not only on stops in utterance-initial position, but also on intervocalic stops and

stops in clusters. The effect of rate was found in Russian (surface-)voiced stops, which

are assumed to be phonologically specified with the feature [voice]. The duration of

voicing in these stops changed at different speaking rates. Voicing was the longest in the

list condition, averaging 73 ms, and the shortest in the fast condition, averaging 46 ms.

Temporal cues in intervocalic voiceless stops, which are assumed to be

phonologically unspecified for a laryngeal feature, were not affected by speaking rate.

The VOT durations of voiceless intervocalic stops remained relatively stable (23 ms)

across all speaking rates, in line with the pattern found for short-lag VOT in initial stops

(Kessinger and Blumstein 1997; Magloire and Green 1999; Solé and Estebas 2000). In

addition, no effect of speaking rate was found on the durations of a short voicing tail into

closure in voiceless stops. It remained relatively stable in all speaking rates and averaged

15 ms across different environments.

Although vocal fold vibration may not be the only cue for the [voice] contrast (see

Ladefoged 1967, Kingston and Diehl 1994, Kong 2009 for discussion), it is apparently the

most important cue in Russian. It is an invariant cue that shows the greatest effect of the

underlying voicing contrast and it is the best predictor of a voicing category in stops in all

positions. Other significant cues can be viewed as consequences of gestures to ensure and

facilitate active vocal fold vibration. Consistent lower values of F1 before and after

voiced stops in Russian, observed in word-initial and word-medial positions, strongly

suggest that vocal fold vibration is accompanied by a larynx lowering gesture. Lower f0

values, consistently found after release of voiced stops, suggest that voiced stops are

pronounced with slacking of the vocal folds. The voicing gestures result in little variation

in voiced stops. Speakers tended to voice stops during their entire closure, which, on

average, was voiced for 98.8% of a stop’s duration; voicing was unbroken in 95.9% of

word-internal tokens.

Voiceless stops in Russian are produced without vocal fold vibration for the

majority of the stop closure, and the amount of voicing during closure does not change

across speaking rates. The findings show that speakers produce a short and relatively

stable amount of voicing in voiceless stops, which does not change as a function of

speaking rate, just as the short lag VOT in these stops is stable at different speaking rates.

8.2. Incomplete neutralization in cases of voice assimilation

and final devoicing

8.2.1. Results of the current study

Voice assimilation and final devoicing do not result in complete neutralization of

the underlying laryngeal contrast in stops. Traces of the underlying voicing contrast were

consistently found in the duration of a preceding vowel for cases of devoiced word-final

stops. Speakers consistently produced longer vowels before underlying voiced stops.

Although the actual difference was small (7 ms across all conditions in Experiments 1

and 5), it was, nevertheless, found in all speakers. The results unequivocally indicate that

speakers produce differences between underlying voiced and voiceless word-final stops

and do not completely neutralize the voicing contrast.

The results of this study show that traces of underlying voicing are also found in

the duration of a preceding vowel for assimilated stops in stop clusters. This pattern is

found in word-internal stops, as well as in prepositions. The duration of a vowel before a

C1 stop in a cluster changed as a function of C2 voice, as well as a function of underlying

voice. In other words, speakers pronounced longer vowels before voiced clusters, but the

vowels were longer before underlying /d/ than before underlying /t/. The same tendency

was observed in voiceless clusters. The vowels in general were shorter before

phonetically voiceless assimilated stops, but speakers distinguished between underlying

/d/ and /t/ and produced a longer vowel before underlying /d/s.

Another cue that left traces of underlying voicing in assimilated and devoiced

stops is F1. Speakers consistently produced lower F1 before underlying voiced stops in

the same way as they produced longer vowels in this position. This unambiguously

indicates that speakers do not completely neutralize the underlying contrast in assimilated

stops.

8.2.2. Results of the current study and previous studies

Speakers in this study varied in terms of which cues leave traces of the underlying

voicing contrast. All cues (voicing, closure duration, duration of vowel, and F1) showed

traces of underlying voicing, but in each specific condition, traces were observed only in

one or two cues but never in all cues. Speakers tended to preserve the voicing contrast in

terms of duration of a preceding vowel and F1 values, but largely neutralized the contrast

in closure duration and duration of voicing. Other studies that have investigated

neutralization in devoiced stops (e.g. Slowiaczek and Dinnsen (1985), Jassem and Richter

(1989) for Polish; Dinnsen and Charles-Luce (1984), Charles-Luce and Dinnsen (1987)

for Catalan; Pye (1986), Barry (1988), and Dmitrieva et al (2010) for Russian; Warner et

al (2004) for Dutch) also report incomplete neutralization, but not all cues showed traces

of underlying voice contrasts.

In Slowiaczek and Dinnsen’s (1985) experiment on Polish final obstruents in

monosyllabic minimal pairs produced by 5 speakers, traces of the underlying voicing

were found in vowel duration, in duration of voicing of the bilabial stop /b/, and for some

speakers, in closure duration. Jassem and Richter (1989), in contrast, argue that

incomplete neutralization, found in Slowiaczek and Dinnsen (1985), is an artifact of

reading the word list. They designed an experiment in which speakers were prompted to

produce words with final obstruents without seeing an orthographic form. They report the

results for 4 speakers and a non-significant trend toward longer vowels and shorter stop

closure for underlying voiced stops.

Barry (1988) investigated laryngeal neutralization in final stops in Russian

produced by 8 speakers. Her findings suggest that the difference in vowel duration (4%)

was observed in 6 speakers, and the difference in closure duration (6%) was observed in

4 speakers, but neither difference reached significance, possibly due to a small pool of subjects

(8) and tokens (11).

Dmitrieva et al (2010) investigated the effect of L2 on production of final

devoicing in Russian. For the four monolingual speakers of Russian, a significant

difference in closure duration (16 ms) and release (16 ms) of devoiced final stops was

found. The differences in vowel duration (2 ms) and duration of voicing into closure (1

ms) were not significant. The results in Pye (1986) must be used with caution because 4

out of 5 speakers in her study were Russian-English bilinguals. As shown in Dmitrieva et

al (2010), incomplete neutralization of the laryngeal contrast is more robust in Russian-

English bilinguals due to the influence of L2. Pye found significant differences in vowel

duration (16 ms), stop closure (12 ms), and duration of voicing into closure (22 ms), but

the differences were considerably larger in bilingual speakers. For the monolingual

Russian speaker in Pye’s (1986) study, incomplete neutralization was observed in vowel

duration (11 ms); the contrast was neutralized in terms of stop closure duration and

duration of voicing.

Warner et al (2004) found incomplete neutralization in devoiced final stops in

Dutch, another true voice language. 15 speakers produced small but significant

differences in vowel duration, but the differences in closure duration and voicing duration

were largely neutralized. In addition, speakers showed an ability to use small differences

in vowel duration to distinguish between underlying voiced and voiceless final stops in

perception. Similar findings were shown in the perception study in Matsui (2011) for

speakers of Russian.

The results in this study, as well as the results of previous studies, show two

things. First, there is considerable variation in speakers in terms of which cues show

traces of underlying voicing. As Slowiaczek and Dinnsen (1985: 336) point out, only

some speakers (one subject in their pool) show incomplete neutralization of the contrast

in all cues. This probably explains why many researchers believe that the pattern in final

stops is neutralization, whereas the results suggest that the pattern is, in fact, incomplete

neutralization.

Second, the effects of underlying voicing on cues in these cases are very small,

usually falling in the range between 3 and 15 ms. The precision of measurements in this

case can affect the results. Jassem and Richter (1989), who argued that neutralization is

complete in Polish, report that the accuracy of their measurements was within 10 ms

whereas the relevant differences which they tested statistically were around 5 ms.

Apparently, with more precise measurements the results could reach significance. In

addition, a large pool of subjects and tokens is needed to capture the significance of these

effects. Most studies that have reported that neutralization in cues was complete (e.g.

Barry 1988, Jassem and Richter 1989) had 8 speakers or fewer. The differences observed

in these studies were probably not significant due to a small pool of subjects. The authors

who report significant differences and incomplete neutralization in some cues (e.g.

Slowiaczek and Dinnsen 1985, Warner et al 2004) had 8 speakers or more. Even the

difference between 6 and 8 subjects can affect the results of a statistical test. 14 speakers

were used in the experiments reported in Chapters 3-6, and significant differences were

found in some (but not all) cues.
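
The dependence of significance on the number of subjects can be illustrated with a small worked example. This is only a sketch under stated assumptions: the mean difference (5 ms) and the standard deviation of the per-speaker differences (5.5 ms) are hypothetical values chosen for illustration, not data from any of the studies cited above, and a paired t-test is assumed as the statistical test. The critical values are the standard two-tailed 5% values of Student's t.

```python
import math

# Two-tailed 5% critical values of Student's t, keyed by degrees of freedom (n - 1).
T_CRIT_05 = {5: 2.571, 7: 2.365, 13: 2.160}

# Hypothetical values for illustration only (not taken from any cited study).
MEAN_DIFF_MS = 5.0   # mean voiced-minus-voiceless difference across speakers
SD_DIFF_MS = 5.5     # standard deviation of the per-speaker differences


def paired_t(mean_diff, sd_diff, n):
    """t statistic of a paired t-test: mean difference over its standard error."""
    return mean_diff / (sd_diff / math.sqrt(n))


def is_significant(n):
    """True if the assumed effect exceeds the 5% criterion with n subjects."""
    return paired_t(MEAN_DIFF_MS, SD_DIFF_MS, n) > T_CRIT_05[n - 1]


if __name__ == "__main__":
    for n in (6, 8, 14):
        print(n, round(paired_t(MEAN_DIFF_MS, SD_DIFF_MS, n), 2), is_significant(n))
```

Under these assumed values, the same 5 ms difference falls short of the criterion with 6 subjects but exceeds it with 8 and with 14 subjects, because the standard error shrinks with the square root of the number of subjects.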

8.2.3. Implications for phonology

The findings of incomplete neutralization in this study, as well as similar findings

in other studies, present a problem for current views about how phonetics and phonology

interact. The view which is shared by most generative phonologists can be traced back to

the model proposed in Chomsky and Halle’s Sound Pattern of English (1968), in which

phonological rules precede phonetic rules. The major problem with this model is that

segments somehow must “inherit” their underlying voicing category after it is changed by

phonological processes, so that different implementations of voicing are possible in the

phonetics. Assuming that voice assimilation is phonological and lengthening of a vowel

before a voiced consonant is phonetic, when an underlying voiceless stop (e.g. /t/) is

assimilated by a phonological rule22 and becomes voiced, it has the same phonological

specification as an underlying /d/ that was not changed by a rule. Both segments would

be specified as [voice], and hence both should be treated in the same way in the

phonetics.

The experimental results, however, show that this cannot be right. Speakers

produce a difference between an underlying /d/ and an underlying /t/ that is assimilated in

the phonology. Phonetic implementation of these segments is apparently different.

22 The same problem arises in Optimality Theory. If there is a constraint that states that a vowel is

long before a voiced stop and another that requires that all obstruents in a cluster agree in voice, it is

unclear how a vowel before an underlying voiced stop could differ in length from a vowel before a stop that

was underlyingly voiceless but became voiced through assimilation.

Although both stops are fully voiced and have similar closure durations, voiced segments

resulting from underlying /d/s are, on average, produced with lower F1 and a longer

preceding vowel than voiced segments resulting from underlying /t/s. This means that

phonetic implementation of the phonological feature [voice] somehow must take into

account what the underlying status of a segment is. The same trend was observed in

devoiced final stops. Speakers distinguished between underlying voicing in devoiced

stops when they produced a longer vowel and lower F1 before underlying /b, d, g/ than

before underlying /p, t, k/. Both categories should be phonologically unspecified after

final devoicing applies and, thus, indistinguishable in phonetics if phonology precedes

phonetics.

Slowiaczek and Dinnsen (1985), who obtained similar results for devoiced word-

final stops in Polish in a subset of cases considered here, discuss how such incomplete

neutralization might be accounted for. According to them (pp. 338-339), evidence of

incomplete neutralization in cases of final devoicing in Polish and Catalan (see also

Dinnsen and Charles-Luce 1984 and Charles-Luce and Dinnsen 1987 for discussion of

the Catalan case) can be accounted for by allowing some kind of interaction between

phonology and phonetics. Under their approach, lengthening of a vowel before voiced

stops must occur before phonological assimilation or devoicing. After stops are

assimilated in the phonology, another lengthening occurs, which ensures that a vowel is

longer before all phonetically voiced stops. This ordering correctly predicts that a vowel is

longer before a [d] resulting from an underlying /d/ and shorter before an assimilated [d]

resulting from an underlying /t/, and that vowels are longer before voiced clusters than before

voiceless clusters. The other ordering of processes (i.e. phonological assimilation before

phonetic vowel lengthening) makes the two types of [d]s indistinguishable in the phonetics.

Slowiaczek and Dinnsen argue that such an adaptation of the existing model is plausible

since “there is nothing in principle which excludes the possibility of phonetic

implementation rules applying before phonological rules” (p.339).
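
The effect of the two orderings can be made concrete with a minimal sketch. It is an illustration of the ordering argument only, not Slowiaczek and Dinnsen's formalization: the function names are hypothetical, and the base duration and the two lengthening increments are arbitrary values, not measurements.

```python
from dataclasses import dataclass

# Arbitrary illustrative values in ms (assumptions, not measured durations).
BASE_MS = 100.0
EARLY_LENGTHENING_MS = 7.0    # lengthening applied before assimilation
LATE_LENGTHENING_MS = 15.0    # lengthening applied after assimilation


@dataclass
class Token:
    c1_voiced: bool            # current [voice] value of C1 (starts as underlying)
    c2_voiced: bool            # [voice] value of the following obstruent
    vowel_ms: float = BASE_MS  # duration of the vowel preceding C1


def lengthen(tok, amount):
    """Phonetic rule: lengthen the vowel if C1 is currently voiced."""
    if tok.c1_voiced:
        tok.vowel_ms += amount


def assimilate(tok):
    """Phonological rule: C1 takes on the voicing of C2."""
    tok.c1_voiced = tok.c2_voiced


def derive(underlying_voiced, c2_voiced, phonetics_first):
    """Return the vowel duration under one of the two rule orderings."""
    tok = Token(underlying_voiced, c2_voiced)
    if phonetics_first:
        lengthen(tok, EARLY_LENGTHENING_MS)   # sees underlying voicing
        assimilate(tok)
        lengthen(tok, LATE_LENGTHENING_MS)    # sees surface voicing
    else:
        assimilate(tok)                       # underlying voicing is lost here
        lengthen(tok, EARLY_LENGTHENING_MS)
        lengthen(tok, LATE_LENGTHENING_MS)
    return tok.vowel_ms


if __name__ == "__main__":
    for first in (True, False):
        for underlying in (True, False):
            for c2 in (True, False):
                print(first, underlying, c2, derive(underlying, c2, first))
```

With lengthening ordered first, the sketch yields four distinct durations (underlying /d/ longer than underlying /t/ within both voiced and voiceless clusters). With assimilation ordered first, the duration depends on C2 alone, so the two types of [d] receive the same value.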

In some phonological theories, e.g. Lexical Phonology (Kiparsky 1985) or

Derivational OT (Rubach 2008), Final Devoicing and Regressive Voice Assimilation are

applied at late stages of a derivation. Thus, it is possible to assume that phonetic

implementation rules might access not underlying representations but, rather,

intermediate input representations to the component in Lexical Phonology or to levels 2

(word level) or 3 (phrase level) in Derivational OT. Nonetheless, even on this view of the

organization of phonology, we need phonetic processes before some phonological

processes.

Additional evidence for the interaction between phonology and phonetics can be

found in models of speech perception and spoken word recognition, e.g. the TRACE

model of word recognition (McClelland and Elman 1986) or models of cascading

activation of speech production (e.g. Dell 1986, Dell and O’Seaghdha 1991). The models

employ access to higher levels (phonological and lexical) and an obligatory feedback loop

between lower, phonetic, levels and higher levels. The models reach an outcome

comparable with the natural outcome of human subjects only when they utilize activation

and access to all available higher levels.

In line with these theories, McQueen and Cutler (1997) and Boersma (2006)

introduce additional module(s) with access to phonology, which account(s) for perception

and production of lexical items. Boersma (2006) argues for four forms in a model of

phonology-phonetics interface: the underlying form (UF), which is traditionally placed in

the realm of phonology, and three phonetic forms: abstract surface form (SF),

traditionally placed in the domain of phonetics, auditory form (AudF), and articulatory

form (ArtF).23 Boersma claims that both UF and SF independently map on AudF and

23 Boersma (2005) argues that independent evidence for two separate forms comes from the fact

that 9 month-old children can perceive speech sounds without realizing how to pronounce them (see

Jusczyk 1997 for details).

ArtF. This double mapping can, in theory, describe how both effects of underlying

voicing and phonetic voicing are found in assimilated and devoiced stops. Although a

detailed mechanism of such mapping is not clear at this point, Goldrick and Blumstein’s

(2006) study suggests it might result from cascading activation of several competing

representations.

Goldrick and Blumstein analyzed speakers’ production of erroneous “voiced” and

“voiceless”24 English stops in tongue-twisters. They found that these segments had

acoustic properties of both intended and correctly pronounced segments. In the errors

analyzed in their study, acoustic properties directly associated with the initial stop (e.g.

VOT or burst duration) were consistent with correct target stops that matched the error.

Thus, VOTs of “k” → [g] errors were closer to correct [g] tokens than to correct [k]

tokens. But secondary cues (e.g. vowel duration) intriguingly patterned with intended

segments: vowels following “k” → [g] errors had similar duration to vowels following

correct [k]s. Goldrick and Blumstein suggest that presence of traces of underlying

voicing in vowel length supports evidence for cascading activation and competition

between several representations. They argue that these traces are possible only if there is

partial activation of a competing representation. These findings are consistent with the

results of this study, in which traces of underlying voicing in assimilated and devoiced

stops were also found for vowel duration.

Other solutions discussed by Slowiaczek and Dinnsen are less plausible. One

option is to claim that there is no phonological final devoicing, and difference in phonetic

implementation of devoiced final obstruents is merely allophonic variation. Voiced stops

under this theory would have two allophones: a fully voiced allophone would occur in

initial and intervocalic position and a (partially) devoiced allophone would occur word-

24 The labels “voiced” and “voiceless” for English stops represented in spelling as “g” and “k”

reflect the authors’ use based on a long-standing tradition.

finally. Counterevidence to this solution is found in this study. While allophonic variation

could technically be a solution in cases of final devoicing, it does not work for cases of

voice assimilation. No variation was found in voicing and closure duration for

assimilated stops in clusters, which eliminates allophonic variation as an option.

Incomplete neutralization was found, instead, in duration of a preceding vowel, which

suggests that the contrast was preserved by means other than allophonic variation.

Finally, Slowiaczek and Dinnsen argue against a solution which introduces a feature

different from [voice], e.g. [±tense].

Some authors, nevertheless, question whether neutralization of the voicing

contrast in final devoicing is incomplete. Iverson and Salmons (2011) argue that final

devoicing is a complete phonological process and incomplete neutralization in some cues

found in different studies is an effect of other, task-specific factors that can influence

perception and production. One such factor is orthography. Speakers can be affected by

spelling when they read a word list (see Fourakis and Iverson 1984 for German; Warner

et al 2004, 2006 for Dutch). When speakers were asked to produce words that were not

presented to them in spelling, no traces of underlying voicing were found in final stops.

The results of this study, however, suggest that influence of orthography cannot fully

account for incomplete neutralization in Russian. Apparently, orthography has some

impact on production in the list reading condition. But in our study incomplete

neutralization was also found in the fast rate condition. Speakers still produced longer

vowels before underlying voiced stops in clusters in fast speech.

Another factor that must be considered is lexical contrast. It has been shown that

words with minimal pairs can be more resistant to phonological alternations to facilitate

lexical access in dense neighborhoods (see Wedel 2002, Ussishkin and Wedel 2009 for

details). Snoeren et al (2006) showed a small trend toward incomplete neutralization in

words that are minimal pairs in cases of voice assimilation across a word boundary in

French. The results of our study suggest that speakers do tend to leave more traces of

underlying voicing in final stops in short words that are minimal pairs (Chapter 7 of this

study) than in disyllabic non-minimal pairs (Chapter 3 ibid.).

8.3. Voicing in prepositions

The results of Experiment 3 on voice assimilation in prepositions provide

evidence that so-called “sonorant transparency” to voice assimilation is not a

phonological rule of fast speech in Russian, as it has been claimed by Hayes (1984). No

effect of a second obstruent (C2) on voicing in preceding stops was found when a

sonorant consonant intervenes. Presonorant C1 stops in obstruent-sonorant-obstruent

clusters do not change their underlying voicing, just as they do not in other presonorant

positions. No effect of C2 voicing was obtained for other cues (e.g. vowel duration),

either. Changes in duration of voicing were found in some stops (14% of all tokens), but

these changes cannot be interpreted as “sonorant transparency”. The statistical tests

unambiguously indicated that these changes were not triggered by the rightmost (C2)

obstruent in a cluster. Variation in voicing duration in obstruent-sonorant-obstruent

clusters was sometimes observed not only in voiceless C1 stops before voiced C2

obstruents and in voiced C1 stops before voiceless C2 obstruents (a scenario usually

found in clusters with voice assimilation) but also in voiced C1 stops before voiced C2

obstruents. In addition, longer voicing during closure (VR greater than 50%) was

sometimes observed in voiceless C1 stops before voiceless C2 stops. Cases like these

suggest that variation in duration of voicing in C1 stops is probably caused by a long

cluster rather than by assimilation.

The question remains as to why variation in duration of voicing emerges in some

tokens and occurs more often in fast speech. One reason is likely to be the prosodic

domain in which obstruent-sonorant-obstruent clusters occur. Recall that “sonorant

transparency” was claimed to occur across a boundary between a preposition and a

content word. The results of Experiment 2 indicate that part of the variation in voicing

duration of stops in prepositions should be attributed to the prosodic structure. Recall that

prepositions in Russian are prosodified under the Prosodic Word (Selkirk 1995), but they

are still separate lexical items.

Independent evidence from Russian (Surface) Palatalization (see Rubach 2000,

Gribanova 2008 for a phonological analysis of this process) shows that a boundary

between a preposition and a following word (noun or adjective) separates two lexical

domains. Palatalization before a front vowel /i/ is obligatory word-internally, but it does

not occur across a word boundary. Similarly, palatalization before /i/ does not operate

across a boundary between a preposition and a content word.

Voicing processes across a clitic boundary in Russian operate in the same fashion

as do voicing processes word-internally (e.g. no final devoicing is found in final stops in

prepositions in presonorant position). The results of Experiment 2, however, suggest that

prosodic structure can affect the voicing in stops in prepositions, in general, not just in

obstruent-sonorant-obstruent clusters. Recall that more variation in duration of voicing

during closure was found in voiceless stops in prepositions than in word-internal stops. It

was not unusual to find underlying voiceless stops that were voiced for more than 50% of

their closure, even in voiceless clusters. I suggest that the

effect of prosodic structure on voicing in Russian stops in prepositions results in

“sloppier” production of voicing. It reveals itself as more variation in voicing during

closure (see also Davidson and Roon (2008) for effects of prosodic structure on stop

closure duration in Russian). A similar effect of prosodic structure was observed in some

obstruent-sonorant-obstruent clusters, resulting in variation in voicing duration, but

the longer and more marked sequence of segments in these clusters apparently presented

additional difficulties to speakers25. Even more variation in voicing duration was found

in such clusters, but this variation is not the effect of C2.

Independent evidence of possible effects of prosody on results of phonological

processes is found in different languages. Zsiga (1995) found that coronal palatalization

of [s] before [j] in American English is incomplete when it occurs across a word

boundary. Snoeren et al (2006) report that voice assimilation is incomplete across a word

boundary in French.

Variation in voicing duration in stops in prepositions was found more often in fast

speech. The reasons for more variable production of voicing in fast speech can be found

in articulatory implementation of voicing. Production of voicing during closure crucially

depends on the timing of the two events: the end of stop closure and the onset/offset of

vocal fold vibration. Controlling timing of articulatory gestures is harder in fast speech

than in slow speech; therefore, more “noise” in production is expected in fast speech.

It is also possible that the relationship between alternations found in obstruent-

sonorant-obstruent clusters and speaking rate is a legitimate result of a trade-off

between the speaking rate and the complexity of syllable structure (see Chitoran and

Cohn 2009 for discussion). Chitoran and Cohn (2009) suggest that syllable complexity

correlates with speaking rate. Syllabic structure is preserved at slow rate and simplified at

fast rate. The results of this study support this hypothesis. Speakers used different repair

strategies (feature sharing, metathesis, deletion) to produce simpler syllable structures in

14% of obstruent-sonorant-obstruent clusters in the fast rate condition. Changes in

duration of voicing in such clusters, sometimes interpreted as a voice assimilation

25 Phrases with obstruent-sonorant-obstruent clusters were often hard to pronounce. In many

cases the speakers stumbled or stopped in the middle of the phrase and started again. They often

commented that they were tired when pronouncing these clusters.

through a sonorant in Russian, are not even the most pervasive pattern. Nasalization and

sonorant devoicing occurred as often as changes in voicing duration.

Thus, speakers preserve the underlying contrast in stops in prepositions before

sonorants (both vowels and sonorant consonants) with some variation in voicing

duration; stops in obstruent clusters in prepositions show clear evidence of voice

assimilation. But stops in prepositions before /v/ followed by an obstruent in Russian

show a different pattern. Some speakers assimilate final stops in prepositions before a

voiced /v/ followed by a voiced stop, but other speakers do not assimilate stops in this

position. They preserve the underlying contrast in stops in a fashion similar to the

position before a presonorant /v/. Yet no optionality was found in cases when stops

preceded /v/ followed by a voiceless stop. Fricative /v/ assimilated in voicing (devoiced)

and triggered voice assimilation in a preceding stop. These results are not compatible

with Hayes’s (1984) proposal that “sonorant transparency” and assimilation before /v/ in

Russian are governed by the same phonological rule of “voice assimilation through a

sonorant”26. Apparently, there is a clear difference in the two cases. Assimilation

(devoicing) is obligatory before devoiced /v/; assimilation in voicing is optional but still

categorical before a voiced /v/; but no assimilation is found in obstruent-sonorant-

obstruent clusters.

8.4. Conclusions

In this study I have examined the voicing properties of presonorant stops, as well

as cases of voice assimilation and final devoicing in different prosodic positions in

Russian. The results showed that (1) voicing during closure is the most important cue for

the laryngeal contrast in intervocalic stops in Russian and duration of voicing changes as

a function of speaking rate, that (2) phonological processes of voice assimilation and final

26 Hayes (1984) argues that Russian [v] is underlyingly a sonorant /w/.

devoicing do not result in complete neutralization of the underlying voicing contrast, that

(3) “sonorant transparency” to voice assimilation is not “a phonological rule of fast

speech”, as was previously claimed in the literature, and that (4) voice assimilation before

voiced /v/ is optional for some speakers.

The results provide additional evidence that speaking rate affects temporal cues in

voiced and voiceless stops differently. Voicing duration changes in different speaking

rate conditions only in stops specified with [voice]. No changes are found in voiceless

stops, assumed to be unspecified for voice. This effect is similar to the effect of rate on

VOT in initial stops, reported in earlier studies: VOT changes as a function of speaking

rate in stops specified with [voice] or [s.g.], but no change occurs in unspecified voiceless

unaspirated stops.

The results of this study provide support for previous claims that neutralization of

underlying voicing in final stops in some languages is incomplete. Speakers tend to

preserve underlying voicing contrasts in some cues (duration of voicing during closure,

closure duration, duration of a preceding vowel). This study provides evidence that

incomplete neutralization in vowel duration is also found in cases of voice assimilation.

Finally, the study shows that stops in prepositions can exhibit more variation in

voicing than word-internal stops. Voice assimilation is sometimes incomplete in

obstruent clusters. Voice assimilation before /v/ is optional for some speakers. Stops in

obstruent-sonorant-obstruent clusters, in contrast, do not show evidence for voice

assimilation.

The results of the study raise questions about the assumption that phonology

precedes phonetics. An adequate account of the findings requires an interface that allows

interaction between the modules.

REFERENCES

Allen, J. Sean and Joanne L. Miller. 1999. Effects of syllable-initial voicing and speaking rate on the temporal characteristics of monosyllabic words. Journal of the Acoustical Society of America 106, 2031-2039.

Alphen van, Petra and Roel Smits. 2004. Acoustical and perceptual analysis of the voicing distinction in Dutch initial plosives: The role of prevoicing. Journal of Phonetics 32, 455–491.

Anttila, Arto. 1997. Deriving variation from grammar. In Frans Hinskens, Roeland van Hout and Leo Wetzels (eds.) Variation, Change and Phonological Theory. 35-68. Amsterdam: John Benjamins.

Avanesov, Ruben I. 1968. Russkoe Literaturnoe Proiznoshenie. Moscow: Prosveshchenie.

Avery, Peter and William Idsardi. 2001. Laryngeal dimensions, completion and enhancement. In T. Alan Hall (ed.) Distinctive Feature Theory, 41–70. Berlin: de Gruyter.

Baranovskaja, S.A. 1968. Pozicionnoe vlijanie na var’irovanie soglasnyx po gluxosti-zvonkosti v sovremennom russkom literaturnom jazyke. Trudy Universiteta Družby Narodov im. Patrisa Lumumby 29: 24-38.

Barry, Martin C. 1985. A palatographic study of connected speech processes. In Cambridge Papers in Phonetics and Experimental Linguistics, vol. 4, 1-16. Department of Linguistics, University of Cambridge.

Barry, Susan M. E. 1988. Temporal aspects of the devoicing of word-final obstruents in Russian. In J. N. Holmes and W. A. Ainsworth (eds.) Speech’88 Proceedings of the Federation of Acoustical Societies of Europe, August 1988. 81-88. Edinburgh: Institute of Acoustics.

Barry, Susan M. E. 1995. Variation in vocal fold vibration during voiced obstruents in Russian. European Journal of Disorders of Communications 30, 124-131.

Beckman, Jill, Michael Jessen, and Catherine Ringen. 2009. German fricatives: Coda devoicing or positional faithfulness? Phonology 26, 231–268.

Beckman, Jill, Pétur Helgason, Bob McMurray, and Catherine Ringen. 2011. Rate effects on Swedish VOT: Evidence for phonological overspecification. Journal of Phonetics 39, 39-49.

Beckman, Jill, Michael Jessen, and Catherine Ringen. (in press). Empirical evidence for laryngeal features: German vs. true voice languages. Journal of Linguistics.

Benkí, José R. 2005. Perception of VOT and first formant onset by Spanish and English speakers. In James Cohen, Kara T. McAlister, Kellie Rolstad, and Jeff MacSwan (eds.) Proceedings of the 4th International Symposium on Bilingualism. 240-248. Somerville, MA: Cascadilla Press.

Boersma, Paul. 2005. Some listener-oriented accounts of hache aspiré in French. MS, University of Amsterdam. Rutgers Optimality Archive 730. http://roa.rutgers.edu.

Boersma, Paul. 2006. Prototypicality judgments as inverted perception. In Gisbert Fanselow, Caroline Féry, Ralf Vogel, and Matthias Schlesewsky (eds.) Gradience in Grammar: Generative Perspectives. 167-184. Oxford: OUP.

Boersma, Paul and David Weenink 2011. Praat: doing phonetics by computer [Computer program]. Version 5.2.15, retrieved 11 February 2011 from http://www.praat.org/

Burton, Martha W. and Karen E. Robblee. 1999. A phonetic analysis of voicing assimilation in Russian. Journal of Phonetics 25, 97-114.

Byrd, Dani. 1996. Influences on articulatory timing in consonant sequences. Journal of Phonetics 24, 209-244.

Caramazza, A. and G. H. Yeni-Komshian. 1974. Voice onset time in two French dialects. Journal of Phonetics 2, 239-245.

Campos-Astorkiza, Rebeka. 2006. Lithuanian contrastive vowel length and the voicing effect: The role of minimal contrast. Paper presented at the 14th Manchester Phonology meeting.

Chen, Matthew. 1970. Vowel length variation as a function of the voicing of consonant environment. Phonetica 22, 129–159.

Chang, Charles B. 2012. Rapid and multifaceted effects of second-language learning on first-language speech production. Journal of Phonetics 40, 249-268.

Charles-Luce, James and Dinnsen, Daniel. 1987. A reanalysis of Catalan devoicing. Journal of Phonetics 15, 187-190.

Chitoran, Ioana and Cohn, Abigail C. 2009. Complexity in phonetics and phonology, gradience, categoriality, and naturalness. In F. Pellegrino, E. Marsico, I. Chitoran & C. Coupé (eds.) Approaches to phonological complexity, Phonology & Phonetics Series. 21-46. Berlin: Mouton de Gruyter.

Cho, Young-mee Yu. 1990. “Parameters of consonantal assimilation.” PhD. diss. Stanford University.

Cho, Taehong and Peter Ladefoged. 1999. Variation and universals in VOT: Evidence from 18 languages. Journal of Phonetics 27, 207-229.

Cho, Taehong, Sun-Ah Jun, and Peter Ladefoged. 2002. Acoustic and aerodynamic correlates of Korean stops and fricatives. Journal of Phonetics 30, 193-228.

Chomsky, Noam and Morris Halle. 1968. The Sound Pattern of English. New York: Harper & Row.

Cohn, Abigail C. and Catherine Lockwood. 1994. A phonetic description of Madurese and its phonological consequences. Working Papers of the Cornell Phonetics Laboratory 9, 67–92.

Cole, Jennifer, Gary Linebaugh, Cheyenne Munson, and Bob McMurray. 2010. Unmasking the acoustic effects of vowel-to-vowel coarticulation: A statistical modeling approach. Journal of Phonetics 38, 167-184.

Cooper, William E. 1974. Contingent feature analysis in speech perception. Perception and Psychophysics 16, 229-234.

Davidson, Lisa and Kevin Roon. 2008. Durational correlates for differentiating consonant sequences in Russian. Journal of the International Phonetic Association 38, 137-165.

Dell, Gary S. 1986. A spreading activation theory of retrieval in sentence production. Psychological Review 93, 283-321.

Dell, Gary S. and Padraig G. O’Seaghdha. 1991. Mediated and convergent lexical priming in language production: A comment on Levelt et al. (1991). Psychological Review 98, 604–614.

Dinnsen, Daniel and James Charles-Luce. 1984. Phonological neutralization, phonetic implementation and individual differences. Journal of Phonetics 12, 49-60.

Dmitrieva, Olga, Allard Jongman, and Joan Sereno. 2010. Phonological neutralization by native and non-native speakers: The case of Russian final devoicing. Journal of Phonetics 38, 483-492.

Docherty, Gerard J. 1992. The timing of voicing in British English obstruents. Berlin: Foris.

Eimas, Peter D., Einar R. Siqueland, Peter Jusczyk, and James Vigorito. 1971. Speech perception in infants. Science 171, 303-306.

Eimas, Peter D. and John D. Corbit. 1973. Selective adaptation of linguistic feature detectors. Cognitive Psychology 4, 99-100.

Es’kova, N.A. 1971. K voprosu o svoistvax sonornyx soglasnyx v russkom jazyke. In S.S. Vysotskij (ed.) Razvitie fonetiki sovremennogo russkogo jazyka: Fonologičeskie podsistemy. 243-247. Moscow: Nauka.

Fintoft, Knut. 1961. The duration of some Norwegian speech sounds. Phonetica 7, 19-39.

Flege, James E. 1991. Age of learning affects the authenticity of voice-onset time (VOT) in stop consonants produced in a second language. Journal of the Acoustical Society of America 89, 395-411.

Flege, James E. and Wieke Eefting. 1986. Linguistics and developmental effects on the production and perception of stop consonants. Phonetica 43, 155–171.

Fourakis, Marios and Gregory K. Iverson. 1984. On the incomplete neutralization of German final obstruents. Phonetica 41, 140-149.

Fowler, Carol A., Valery Sramko, David J. Ostry, Sarah A. Rowland, and Pierre Hallé. 2008. Cross language phonetic influences on the speech of French-English bilinguals. Journal of Phonetics 36, 649-663.

Goldrick, Matthew and Sheila E. Blumstein. 2006. Cascading activation from phonological planning to articulatory processes: Evidence from tongue twisters. Language and Cognitive Processes 21, 649-683.

Gósy, Maria. 2001. The VOT of the Hungarian voiceless plosives in words and in spontaneous speech. International Journal of Speech Technology 4, 75-85.

Gow, David W. 2003. Feature parsing: Feature cue mapping in spoken word recognition. Perception and Psychophysics 64, 575-590.

Gribanova, Vera. 2008. Russian Prefixes, prepositions and palatalization in Stratal OT. In Charles B. Chang and Hannah J. Haynie (eds.) Proceedings of the 26th West Coast Conference on Formal Linguistics. 217-225. Somerville, MA: Cascadilla Proceedings Project.

Haggard, Mark, Stephen Ambler, and Mo Callow. 1970. Pitch as a voicing cue. Journal of the Acoustical Society of America 47, 613–617.

Halle, Morris. 1959. The sound pattern of Russian: A linguistic and acoustical investigation. The Hague & Paris: Mouton & Co.

Halle, Morris and Kenneth N. Stevens. 1971. A note on laryngeal features. Quarterly progress report 101. 198-212. Cambridge, MA: Research Laboratory of Electronics.

Hazan, Valerie L. and Georges Boulakia. 1993. Perception and production of a voicing contrast by French-English bilinguals. Language and Speech 36, 17–39.

Hayes, Bruce. 1984. The phonetics and phonology of Russian voicing assimilation. In M. Aronoff and R.T. Oehrle (eds.) Language. Sound. Structure. 318-328. Cambridge, MA: The MIT Press.

Helgason, Pétur and Catherine Ringen. 2008. Voicing and aspiration in Swedish stops. Journal of Phonetics 36, 607–628.

Hombert, Jean-Marie, John J. Ohala, and William G. Ewan. 1979. Phonetic explanations for the development of tones. Language 55, 37-58.

Honeybone, Patrick. 2005. Diachronic evidence in segmental phonology: the case of obstruent laryngeal specifications. In Marc van Oostendorp and Jeroen van de Weijer (eds.) The Internal Organization of Phonological Segments. 319-354. Berlin: Mouton de Gruyter.

House, Arthur S. and Grant Fairbanks. 1953. The influence of consonant environment upon the secondary acoustical characteristics of vowels. Journal of the Acoustical Society of America 25, 105–113.

Iverson, Gregory K. and Joseph C. Salmons. 1995. Aspiration and laryngeal representation in Germanic. Phonology 12, 369-396.

Iverson, Gregory K. and Joseph C. Salmons. 2011. Final devoicing and final laryngeal neutralization. In Marc van Oostendorp, Colin J. Ewen, Elizabeth Hume and Keren Rice (eds.) The Blackwell Companion to Phonology. 1622-1643. Wiley-Blackwell.

Jakobson, Roman. 1956. Die Verteilung der stimmhaften und stimmlosen Geräuschlaute im Russischen. In Festschrift für Max Vasmer. 199-202. Berlin.

Jakobson, Roman. 1968. K voprosu o gluxosti i zvonkosti russkix schelinnyx gubnyx. Slavia Orientalis 17, 321-324.

Jakobson, Roman. 1978. Mutual assimilation of Russian voiced and voiceless consonants. Studia Linguistica 32, 107-110.

Jassem, Wiktor and Lutosława Richter. 1987. Neutralization of voicing in Polish obstruents. Journal of Phonetics 17, 317-325.

Jessen, Michael. 1998. Phonetics and phonology of tense and lax obstruents in German. Amsterdam: Benjamins.

Jessen, Michael and Catherine Ringen. 2002. Laryngeal features in German. Phonology 19, 189-218.

Jongman, Allard, Joan A. Sereno, Marianne Raaijmakers, and Aditi Lahiri. 1992. The phonological representation of [voice] in speech perception. Language and Speech 35, 137-152.

Kalenčuk, Maria L. and Rosalia F. Kasatkina. 1999. Osobennosti zvukovogo oformlenija russkix pristavok. Russian Linguistics 23, 1-9.

Kavitskaja, Darya. 1999. Voicing assimilation and schizophrenic behavior of /v/ in Russian. In Herbert Coats, Katarzina Dzivirek, and Cynthia M. Vakareliyska (eds.) Workshop on formal approaches to Slavic linguistics. 225-244. Seattle, Wash.

Keating, Patricia A. 1984. Phonetic and phonological representation of stop consonant voicing. Language 60, 286-319.

Keating, Patricia A. 1985. Universal phonetics and the organization of grammars. In V.A. Fromkin (ed.) Phonetic Linguistics. 115-132. New York: Academic Press.

Kessinger, Rachel H. and Sheila E. Blumstein. 1997. Effects of speaking rate on voice-onset time in Thai, French, and English. Journal of Phonetics 25, 143-168.

Kim, Chin-Wu. 1970. A theory of aspiration. Phonetica 21, 107-116.

Kingston, John. 2005. Ears to categories: New arguments for autonomy. In Sónia Frota, Marina Vigário and Maria João Freitas (eds.) Prosodies: With special reference to Iberian languages. 177-222. Berlin: Mouton de Gruyter.

Kingston, John and Randy L. Diehl. 1994. Phonetic knowledge. Language 70, 419–454.

Kiparsky, Paul. 1985. Some consequences of Lexical Phonology. Phonology Yearbook 2, 85-138.

Klatt, Dennis H. 1975. Voice onset time, frication, and aspiration in word-initial consonant clusters. Journal of Speech and Hearing Research 18, 686-705.

Kn’azev, S.V. 2004. Ob ierarxii fonologičeskix pravil v russkom jazyke. In Semiotika, lingvistika, poetika. 133-150. Moscow: Studia Philologica.

Kn’azev, S.V. 2006. Struktura fonetičeskogo slova v russkom jazyke: Sinxronija i diaxronija. Moscow: Izd-vo Maks-Press.

Kong, Eun Jong. 2009. “The development of phonation-type contrasts in plosives: Cross-linguistic perspectives.” PhD diss., The Ohio State University.

Kulikov, Vladimir. 2010. Voicing assimilation in fast speech in Russian. Paper presented at the 5th Annual Slavic Linguistic Society Conference, Chicago, IL.

Labov, William. 1969. Contraction, deletion, and inherent variability of the English copula. Language 45, 715–762.

Ladefoged, Peter. 1967. Three areas of experimental phonetics. London: Oxford University Press.

Ladefoged, Peter and Ian Maddieson. 1996. The sounds of the world’s languages. Blackwell Publishing.

Laeufer, Christiane. 1992. Patterns of voicing-conditioned vowel duration in French and English. Journal of Phonetics 20, 411–440.

Liberman, A.M., P.C. Delattre, and F.S. Cooper. 1958. Some cues for the distinction between voiced and voiceless stops in initial position. Language and Speech 1, 153-167.

Lisker, Leigh. 1975. Is it VOT or a first-formant transition detector? Journal of the Acoustical Society of America 57, 1547-1551.

Lisker, Leigh. 2003. On perceiving certain voiceless unaspirated stops. Proceedings of the 15th International Congress of Phonetic Sciences. Barcelona, Spain. Aug. 3-9, 821-823.

Lisker, Leigh and Arthur S. Abramson. 1964. A cross-language study of voicing in initial stops: acoustical measurements. Word 20, 384-422.

Lisker, Leigh and Arthur S. Abramson. 1967. Some effects of context on voice onset time in English stops. Language and Speech 10, 1-28.

Lihtman, R.I. 1980. Konečnye zvonkie soglasnye na styke slov. Filologičeskie nauki 1, 52-58.

Lombardi, Linda. 1991. “Laryngeal features and laryngeal neutralization.” PhD diss., University of Massachusetts, Amherst. Published in 1994 by Garland, New York.

Lombardi, Linda. 1995. Laryngeal features and privativity. The Linguistic Review 12, 35-59.

Lombardi, Linda. 1999. Positional Faithfulness and voicing assimilation in Optimality Theory. Natural Language and Linguistic Theory 17, 267-302.

Magloire, Joël and Kerry P. Green. 1999. A cross-language comparison of speaking rate effects on the production of voice onset time in English and Spanish. Phonetica 56, 158-185.

Matsui, Mayuki. 2011. The identifiability and discriminability between incompletely neutralized sounds: Evidence from Russian. Paper presented at the 17th International Congress of Phonetic Sciences ICPhS XVII, Hong Kong, China.

McClelland, James L. and Jeffrey L. Elman. 1986. The TRACE model of speech perception. Cognitive Psychology 18, 1-86.

McMurray, Bob, and Allard Jongman. 2011. What information is necessary for speech categorization? Harnessing variability in the speech signal by integrating cues computed relative to expectations. Psychological Review 118, 219–246.

McMurray, Bob, Jennifer S. Cole, and Cheyenne Munson. 2010. Features as an emergent product of perceptual parsing: Evidence from vowel-to-vowel coarticulation. In N. Clements and R. Ridouane (eds.) Where do Features Come From? The nature and sources of phonological primitives. Elsevier: North-Holland Linguistic Series.

McQueen, James M. and Anne Cutler. 1997. Cognitive processes in speech perception. In William J. Hardcastle and John Laver (eds.) The Handbook of Phonetic Sciences. 566-585. Oxford: Blackwell.

Mester, R. Armin and Junko Itô. 1989. Feature predictability and underspecification: Palatal prosody in Japanese mimetics. Language 65, 258-293.

Nielsen, Kuniko. 2011. Specificity and abstractness of VOT imitation. Journal of Phonetics 39, 132–142.

Ohala, John J. 1972. How is pitch lowered? Journal of the Acoustical Society of America 15, 124.

Ohde, Ralph N. 1984. Fundamental frequency as an acoustic correlate of stop consonant voicing. Journal of the Acoustical Society of America 75, 224-230.

Padgett, Jaye. 2002. Russian voicing assimilation, final devoicing, and the problem of [v]. Ms., University of California Santa Cruz. ROA #528.

Panov, Mikhail V. 1967. Russkaja fonetika. Moscow: Prosveshchenie.

Paufošima, R. F. and D. A. Agaronov. 1971. Ob uslovijax assimil’ativnogo ozvončenija soglasnyx na styke fonetičeskix slov v russkom jazyke. In S. Vysotskij (ed.). Razvitie fonetiki sovremennogo russkogo jazyka: Fonologičeskie podsistemy. 189-199. Moscow: Nauka.

Peterson, Gordon E. and Ilse Lehiste. 1960. Duration of syllable nuclei in English. Journal of the Acoustical Society of America 32, 693-703.

Petrova, Olga. 2003. Sonorants and the labiodental continuant /v/ in Russian: An OT analysis. In W. Browne, Ji-Yung Kim, B.H. Partee, and R. Rothstein (eds.) Annual Workshop on Formal Approaches to Slavic Linguistics, The Amherst Meeting 2002, 413–432.

Pind, Jörgen. 1999. The role of F1 in the perception of voice onset time and voice offset time. Journal of the Acoustical Society of America 106, 434–437.

Pye, Susan. 1986. Word final devoicing of obstruents in Russian. In Cambridge Papers in Phonetics and Experimental Linguistics 5, 1-10.

Reformatskii, Aleksandr A. 1975. Fonologičeskie et’udy. Moscow: Prosveshchenie.

Ringen, Catherine and Vladimir Kulikov. (In press). Voicing in Russian stops: Cross-linguistic implications. Journal of Slavic Linguistics.

Ringen, Catherine and Kari Suomi. 2012. The voicing contrast in Fenno-Swedish. Journal of Phonetics 40, 419-429.

Robblee, Karen E. and Martha W. Burton. 1997. Sonorant voicing transparency in Russian. Annual Workshop on Formal Approaches to Slavic Linguistics. The Cornell meeting 1995. 407-434.

Rubach, Jerzy. 1996. Nonsyllabic analysis of voice assimilation in Polish. Linguistic Inquiry 27, 69-110.

Rubach, Jerzy. 1997. Polish voice assimilation in Optimality Theory. Rivista di linguistica 9, 291-342.

Rubach, Jerzy. 2000. Backness switch in Russian. Phonology 17, 39-64.

Rubach, Jerzy. 2008. Prevocalic Faithfulness. Phonology 25, 433-468.

Sancier, Michele L. and Carol A. Fowler. 1997. Gestural drift in a bilingual speaker of Brazilian Portuguese and English. Journal of Phonetics 25, 421-436.

Selkirk, Elizabeth. 1995. The prosodic structure of function words. In Jill Beckman, Laura Walsh Dickey, and Suzanne Urbanczyk (eds.) Papers in Optimality Theory. 439-470. Amherst, MA: GLSA Publications.

Shapiro, Michael. 1993. Russian non-distinctive voicing: A stocktaking. Russian Linguistics 17, 1-14.

Sharf, Donald J. 1962. Duration of post-stress intervocalic stops and preceding vowels. Language and Speech 5, 26-30.

Shrager, Miriam. 2006. Neutralization of word-final voicing in Russian. Paper presented at the 1st meeting of SLS, Indiana University.

Slis, Imah H. 1986. Assimilation of voice in Dutch as a function of stress, word boundaries, and sex of speaker and listener. Journal of Phonetics 14, 311-326.

Slowiaczek, Louisa M. and Daniel A. Dinnsen. 1985. On the neutralization status of Polish word final devoicing. Journal of Phonetics 13, 325-341.

Snoeren, Natalie D., Pierre A. Hallé, and Juan Segui. 2006. A voice for the voiceless: Production and perception of assimilated stops in French. Journal of Phonetics 34, 241–268.

Solé, Maria-Josep and Eva Estebas. 2000. Phonetic and Phonological phenomena: V.O.T. A cross-language comparison. Proceedings of the XVII AEDEAN Conference. Vigo, Spain, 437–444.

Steriade, Donca. 1999. Phonetics in phonology: The case of laryngeal neutralization. UCLA Working Papers in Linguistics 2: Papers in Phonology 3, 25–146.

Stevens, Kenneth N. and D. H. Klatt. 1974. Role of formant transitions in the voiced-voiceless distinction for stops. Journal of the Acoustical Society of America 55, 653-659.

Strycharczuk, Patrycja. 2010a. Word-level effects in Polish laryngeal neutralisation. Paper presented at CUNY Phonology Forum Conference on the Word, January 2010.

Strycharczuk, Patrycja. 2010b. What's in a word? Prosody in Polish voicing. Paper presented at the 18th Manchester Phonology Meeting, May 2010.

Summerfield, Quentin. 1974. Toward a detailed model for the perception of voicing contrast. Speech Perception 3, 1-26.

Summerfield, Quentin. 1981. Articulatory rate and perceptual constancy in phonetic perception. Journal of Experimental Psychology: Human Perception and Performance 7, 1074-1095.

Summerfield, Quentin and Mark Haggard. 1977. On the dissociation of spectral and temporal cues to the voicing distinction in initial stop consonants. Journal of the Acoustical Society of America 62, 435-448.

Ševoroškin, V.V. 1971. O dvux [v] v russkom jazyke. In S.S. Vysotskij (ed.). Razvitie fonetiki sovremennogo russkogo jazyka: Fonologičeskie podsistemy. 279-286. Moscow: Nauka.

Toscano, Joseph C. and Bob McMurray. 2010. Cue integration with categories: Weighting acoustic cues in speech using unsupervised learning and distributional statistics. Cognitive Science 34, 434–464.

Toscano, Joseph C. and Bob McMurray. 2012. Cue-integration and context effects in speech: Evidence against speaking-rate normalization. Attention, Perception & Psychophysics.

Trubetzkoy, Nikolai. 1969. Principles of Phonology (translated by Ch. Baltaxe). Berkeley, CA: University of California Press.

Ussishkin, Adam and Andrew Wedel. 2009. Lexical access, effective contrast, and patterns in the lexicon. In Paul Boersma and Silke Hamann (eds.) Phonology in Perception. 267–292. Berlin, New York: Mouton de Gruyter.

Warner, Natasha, Allard Jongman, Joan Sereno, and Rachèl Kemps. 2004. Incomplete neutralization and other sub-phonemic durational differences in production and perception: Evidence from Dutch. Journal of Phonetics 32, 251-276.

Warner, Natasha, Erin Good, Allard Jongman, and Joan Sereno. 2006. Orthographic vs. morphological incomplete neutralization effects. Journal of Phonetics 34, 285-293.

Wedel, Andrew. 2002. Phonological alternation, lexical neighborhood density and markedness in processing. Paper presented at the Eighth Laboratory Phonology conference.

Westbury, John R. 1983. Enlargement of the supraglottal cavity and its relation to stop consonant voicing. Journal of the Acoustical Society of America 73, 1322-1336.

Williams, Lee. 1977a. The voicing contrast in Spanish. Journal of Phonetics 5, 169–184.

Williams, Lee. 1977b. The perception of stop consonant voicing by Spanish-English bilinguals. Perception and Psychophysics 21, 289-297.

Zimmerman, S.A. and S.M. Sapon. 1958. Note on vowel duration seen crosslinguistically. Journal of the Acoustical Society of America 30, 152-153.

Zsiga, Elizabeth C. 1994. Acoustic evidence for gestural overlap in consonant sequences. Journal of Phonetics 22, 121-140.

Zsiga, Elizabeth C. 1995. An acoustic and electropalatographic study of lexical and post-lexical palatalization in American English. In B. Connell and A. Arvaniti (eds.) Phonology and Phonetic Evidence: Papers in Laboratory Phonology IV. 282-302. Cambridge: Cambridge University Press.

Zsiga, Elizabeth C. 2000. Phonetic alignment constraints: consonant overlap and palatalization in English and Russian. Journal of Phonetics 28, 69-102.

APPENDIX A

LIST OF STIMULI

1. Experiment 1. Word-internal stops

parus ‘sail’, barxat ‘velvet’, talyj ‘thawed’, darom ‘for free’, kapal ‘dropped’, galok ‘magpies’ Gen.pl., braga ‘brew’, pravy ‘right’, drama ‘drama’, travy ‘grasses’, kraby ‘crabs’, graby ‘hornbeam’, napor ‘push’, nabor ‘set’, motor ‘engine’, zador ‘zeal’, zakon ‘law’

bagor ‘hook’, vepr’a ‘boar’ Gen.sg., zebra ‘zebra’, teatra ‘theater’ Gen.sg., kadra ‘frame’ Gen.sg., fiakra ‘fiacre’ Gen.sg., onagra ‘onager’ Gen.sg., katka ‘roller’ Gen.sg., sadka ‘cage’ Gen.sg., molotjba ‘threshing’, gorodjba ‘fencing’

sirop ‘syrup’, sugrob ‘snowdrift’, vorot ‘gates’, narod ‘people’, syrok ‘cheese’, nalog ‘tax’

2. Experiment 2. Stops in prepositional clusters

Prepositions: ot ‘from’, nad ‘over’

Nouns: rama ‘frame’, mama ‘Mom’, par ‘steam’, karta ‘map’, bak ‘tank’, gaz ‘gas’

3. Experiment 3. Stops in obstruent-sonorant-obstruent clusters

Prepositions: ot ‘from’, nad ‘over’

Nouns: mox ‘moss’ (mxom, mxa in PPs), rtut’ ‘mercury’, mzda ‘bribe’, lgun’ja ‘liar’ fem., par ‘steam’, karta ‘map’, bak ‘tank’, gaz ‘gas’

4. Experiment 4. Assimilation before /v/

Prepositions: ot ‘from’, nad ‘over’

Nouns: vtornik ‘Tuesday’, vdovy ‘widows’, Volga ‘the Volga river’

5. Experiment 5. Voicing across a word boundary

grip dolečili ‘flu has been cured’; grip t’ažolyj ‘flu is bad’; grip opasen ‘flu is dangerous’

grib dočistili ‘the mushroom has been peeled’; grib požarili ‘the mushroom has been fried’; grib opasnyj ‘the mushroom is poisonous’

kot belyj ‘the cat is white’; kot pokormlen ‘the cat is fed’; kot otmyt ‘the cat is washed’

kod bezopasnyj ‘the code is safe’; kod podobran ‘the code is deciphered’; kod otkryt ‘the code is open’

luk dožarili ‘onion has been fried’; luk požarili ‘onion has been fried’; luk očistili ‘onion was peeled’

lug dokosili ‘the lawn has been mown’; lug pokosili ‘the lawn was mown’; lug ogorodili ‘the lawn has been enclosed’

APPENDIX B

RESULTS OF ACOUSTIC MEASUREMENTS IN EXPERIMENT 1

Table B1. Mean VOT (ms) and standard deviations (in brackets) for voiced and voiceless utterance-initial stops (list reading).

Place Sonorant Voiceless Voiced

Bilabial Vowel 16.8 (6) -99.6 (23)

Consonant 16.3 (5) -75.1 (17)

Total 16.6 (5) -87.4 (24)

Coronal Vowel 18.7 (7) -105.3 (19)

Consonant 20.9 (5) -82.6 (18)

Total 19.8 (6) -94.0 (22)

Velar Vowel 34.3 (9) -80.8 (30)

Consonant 35.3 (8) -72.1 (25)

Total 32.3 (9) -76.4 (28)

Total 23.7 (10) -85.9 (25)

Table B2. Means (ms) and standard deviations (in brackets) for VOT of word-initial (sentence-medial) voiceless stops at three places of articulation before a vowel and a consonant in slow and fast conditions.

Place Sonorant Slow Fast

Bilabial Vowel 15.4 (4) 15.5 (5)

Consonant 15.8 (4) 13.9 (2)

15.6 (4) 14.7 (4)

Coronal Vowel 17.7 (5) 18.2 (6)

Consonant 20.1 (6) 20.1 (5)

18.9 (6) 19.1 (6)

Velar Vowel 31.3 (6) 31.0 (5)

Consonant 33.3 (5) 30.3 (6)

32.3 (6) 30.6 (5)

Total 22.3 (9) 21.5 (8)

Table B3. Means (ms) and standard deviations (in brackets) for voicing of word-initial (sentence-medial) voiced stops at three places of articulation before a vowel and a consonant.

Slow Fast

Place Sonorant Mean (SD) % of stops with unbroken voice Mean (SD) % of stops with unbroken voice

Bilabial Vowel 101 (16) 100% 79.1 (15) 96.4%

Consonant 91.4 (12) 100% 72.1 (14) 96.4%

96.2 (15) 100% 75.6 (15) 96.4%

Coronal Vowel 95.5 (25) 100% 65.2 (13) 96.4%

Consonant 78.0 (15) 100% 63.4 (14) 92.8%

86.7 (22) 100% 64.3 (13) 94.6%

Velar Vowel 86.1 (17) 100% 62.7 (13) 92.8%

Consonant 66.1 (22) 92.8% 52.8 (14) 92.8%

76.1 (22) 96.4% 57.8 (14) 92.8%

Total 86.3 (21) 98.8% 65.9 (16) 94.6%

Table B4. Means (Hz) and standard deviations (in brackets) for f0 and F1 after voiced and voiceless stops at three places of articulation, pooled across rates and sonorant types.

Cue Means (SD)

Place Voiced Voiceless

f0 Bilabial 179 (61) 185 (64)

Coronal 181 (61) 187 (62)

Velar 181 (62) 189 (65)

179 (61) 187 (64)

F1 Bilabial 491 (72) 511 (82)

Coronal 488 (76) 552 (109)

Velar 450 (67) 533 (104)

476 (74) 532 (100)

Table B5. Means (ms) and standard deviations (in brackets) for voicing during closure of word-medial voiceless stops at three places of articulation before a vowel and a consonant.

Place Sonorant List Slow Fast

Bilabial Vowel 19.8 (11) 16.6 (10) 18.3 (13)

Consonant 16.9 (8) 17.0 (6) 16.0 (7)

18.4 (10) 16.8 (8) 17.2 (11)

Coronal Vowel 15.0 (8) 14.8 (8) 15.5 (10)

Consonant 14.1 (7) 12.5 (6) 13.2 (8)

14.5 (7) 13.7 (7) 14.4 (9)

Velar Vowel 5.8 (5) 6.6 (5) 9.1 (5)

Consonant 9.1 (6) 8.0 (6) 9.6 (6)

7.5 (6) 7.3 (6) 9.3 (6)

Total Vowel 13.5 (10) 12.7 (9) 14.3 (11)

Consonant 13.3 (8) 12.5 (7) 12.9 (7)

13.4 (9) 12.6 (8) 13.6 (9)

Table B6. Means (ms) and standard deviations (in brackets) for closure duration of word-medial voiceless stops at three places of articulation before a vowel and a consonant.

Place Sonorant List Slow Fast

Bilabial Vowel 136.6 (16) 120.6 (16) 87.6 (12)

Consonant 98.2 (18) 88.5 (15) 65.7 (13)

117.7 (26) 104.6 (23) 76.6 (17)

Coronal Vowel 128.8 (20) 108.1 (18) 69.3 (12)

Consonant 85.7 (15) 77.6 (14) 49.9 (15)

107.2 (28) 92.8 (22) 59.6 (17)

Velar Vowel 108.8 (19) 98.4 (15) 67.5 (13)

Consonant 78.7 (11) 71.7 (7) 53.3 (10)

93.7 (22) 85.0 (18) 60.4 (14)

Total 106.2 (27) 94.1 (22) 65.5 (17)

Table B7. Means (ms) and standard deviations (in brackets) for closure duration of word-medial voiced stops at three places of articulation before a vowel and a consonant.

Place Sonorant List Slow Fast

Bilabial Vowel 108.6 (16) 97.3 (12) 68.3 (10)

Consonant 71.0 (13) 66.4 (13) 53.8 (11)

89.8 (24) 81.8 (20) 61.1 (13)

Coronal Vowel 97.7 (21) 87.4 (17) 58.0 (15)

Consonant 62.7 (11) 55.2 (13) 40.9 (12)

80.2 (23) 71.3 (22) 49.5 (16)

Velar Vowel 90.7 (21) 74.8 (15) 56.8 (8)

Consonant 62.0 (14) 58.2 (10) 49.6 (9)

76.3 (23) 66.5 (15) 53.2 (9)

Total 82.1 (24) 73.2 (20) 54.6 (14)

Table B8. Means (ms) and standard deviations (in brackets) for voicing during closure of word-medial voiced stops at three places of articulation before a vowel and a consonant.

Place Sonorant List Slow Fast

Bilabial Vowel 108.6 (16) 97.0 (12) 68.3 (10)

Consonant 69.7 (14) 65.2 (16) 53.2 (11)

89.2 (25) 81.1 (22) 60.8 (15)

Coronal Vowel 97.0 (21) 86.6 (17) 56.9 (15)

Consonant 61.2 (12) 52.0 (15) 40.5 (12)

79.1 (25) 69.3 (23) 48.7 (16)

Velar Vowel 85.2 (20) 74.8 (15) 55.8 (9)

Consonant 59.6 (14) 57.2 (11) 46.0 (13)

72.4 (21) 66.0 (16) 50.9 (12)

Total Vowel 96.9 (21) 86.1 (17) 60.3 (13)

Consonant 63.5 (14) 58.2 (15) 46.6 (13)

80.2 (25) 72.1 (21) 53.5 (15)

Table B9. Mean VR and percent of fully voiced word-medial voiced stops at three places of articulation before a vowel and a consonant.

Mean VR % of fully voiced tokens

Place Sonorant List Slow Fast List Slow Fast

Bilabial Vowel 100% 99.7% 100% 100% 96.3% 100%

Consonant 98.1% 97.4% 98.9% 89.3% 96.3% 92.9%

99.1% 98.6% 99.4% 94.6% 96.4% 96.4%

Coronal Vowel 99.3% 99.3% 98.2% 92.9% 92.9% 89.3%

Consonant 97.8% 94.3% 99.2% 92.9% 82.1% 96.3%

98.6% 96.8% 98.7% 92.9% 87.5% 92.9%

Velar Vowel 95.1% 100% 98.3% 85.7% 100% 92.9%

Consonant 96.5% 98.4% 92.5% 89.3% 92.9% 82.1%

95.8% 99.2% 95.4% 87.5% 98.2% 87.5%

Total Vowel 98.1% 99.7% 98.8% 92.7% 96.4% 94.0%

Consonant 97.5% 96.7% 96.9% 90.5% 90.5% 90.5%

97.8% 98.2% 97.9% 91.7% 93.5% 92.3%

Table B10. Mean VR of word-medial voiceless stops at three places of articulation before a vowel and a consonant.

Place Sonorant List Slow Fast

Bilabial Vowel 14.8% 13.9% 22.2%

Consonant 18.1% 19.9% 24.7%

16.4% 16.9% 23.5%

Coronal Vowel 12.2% 14.0% 23.2%

Consonant 16.7% 16.4% 28.5%

14.5% 15.2% 25.9%

Velar Vowel 5.5% 6.6% 14.1%

Consonant 11.9% 11.3% 17.9%

8.7% 8.9% 16.0%

Total Vowel 10.8% 11.5% 19.8%

Consonant 15.5% 15.9% 23.7%

13.2% 13.7% 21.8%

Table B11. Means (Hz) and standard deviations (in brackets) for f0 and F1 before (‘pre’) and after (‘post’) word-medial voiced and voiceless stops at three places of articulation.

Cue Means (SD)

Place Voiced Voiceless

f0_pre Bilabial 184.8 (67) 183.7 (67)

Coronal 181.7 (67) 183.3 (68)

Velar 181.7 (68) 192.1 (75)

182.7 (67) 186.3 (70)

F1_pre Bilabial 463.4 (87) 484.9 (110)

Coronal 566.9 (76) 645.6 (74)

Velar 583.1 (97) 611.2 (88)

537.8 (102) 559.2 (115)

f0_post Bilabial 182.2 (68) 186.0 (69)

Coronal 181.1 (70) 190.1 (75)

Velar 182.7 (69) 194.5 (78)

182.0 (69) 190.2 (74)

F1_post Bilabial 394.6 (49) 426.6 (79)

Coronal 428.0 (48) 452.9 (56)

Velar 406.6 (50) 438.5 (63)

409.7 (49) 435.7 (66)

Table B12. Means (ms) and standard deviations (in brackets) for closure duration of the first (C1) stop in obstruent clusters.

/d/ /t/ Total

C2 stop Slow Fast Slow Fast Slow Fast

Voiced 40.7 (9) 28.9 (7) 41.9 (11) 30.3 (7) 41.3 (10) 29.6 (7)

Voiceless 65.4 (10) 50.6 (10) 65.6 (14) 45.6 (11) 65.5 (12) 48.1 (11)

Total 53.1 (16) 39.7 (14) 53.8 (17) 37.9 (12) 53.4 (16) 38.8 (13)

Table B13. Means (ms) and standard deviations (in brackets) for voicing during closure of the underlying voiced and voiceless C1 stops in obstruent clusters.

/d/ /t/ Total

C2 stop Slow Fast Slow Fast Slow Fast

Voiced 40.4 (9) 28.9 (7) 41.6 (10) 30.3 (7) 41.0 (10) 29.6 (7)

Voiceless 11.3 (5) 11.2 (8) 11.3 (6) 10.6 (7) 11.3 (6) 10.9 (7)

Total 25.9 (16) 20.0 (12) 26.5 (17) 20.4 (12) 26.2 (17) 20.2 (12)

Table B14. Mean VR for closure duration and percent of fully voiced C1 stops in obstruent clusters.

C2 stop Slow Fast Total

VR Voiced 99.3% 100% 99.7%

Voiceless 17.4% 24.4% 20.7%

% of fully voiced tokens Voiced 96.4% 100% 98.2%

Voiceless 0% 1.8% 0.9%

Table B15. Means (Hz) and standard deviations (in brackets) for f0 and F1 on a vowel preceding a C1 stop in obstruent clusters.

Means (SD)

C2 /d/ /t/

f0 Voiced 176 (56) 176 (58)

Voiceless 178 (58) 179 (59)

177 (57) 178 (58)

F1 Voiced 447 (72) 469 (73)

Voiceless 520 (92) 559 (103)

484 (83) 514 (99)

Table B16. Means (ms) and standard deviations (in brackets) for closure duration of underlying voiced and voiceless final stops at three places of articulation.

Underlying voiced Underlying voiceless

Place Slow Fast Slow Fast

Bilabial 87.5 (21) 63.4 (17) 90.6 (14) 69.5 (12)

Coronal 75.5 (17) 50.2 (15) 78.8 (16) 50.2 (14)

Velar 76.7 (20) 54.0 (14) 75.8 (15) 55.0 (13)

Total 79.9 (20) 55.9 (16) 81.7 (16) 58.2 (15)

Table B17. Means (ms) and standard deviations (in brackets) for voicing during closure of underlying voiced and voiceless final stops at three places of articulation.

Underlying voiced Underlying voiceless

Place Slow Fast Slow Fast

Bilabial 16.3 (8) 18.2 (11) 13.2 (7) 15.3 (9)

Coronal 9.5 (6) 13.3 (9) 12.7 (7) 11.7 (8)

Velar 9.9 (6) 13.7 (9) 9.0 (6) 13.7 (10)

Total 11.9 (7) 15.5 (11) 11.6 (7) 13.6 (9)

Table B18. Mean VR of final underlying voiced and voiceless stops and percent of fully voiced final stops at three places of articulation.

Mean VR % of fully voiced stops

Underlying voiced Underlying voiceless Underlying voiced Underlying voiceless

Place Slow Fast Slow Fast Slow Fast Slow Fast

Bilabial 19.9% 32.5% 15.2% 23.3% 0% 3.6% 0% 0%

Coronal 13.5% 17.1% 17.1% 26.1% 0% 3.6% 0% 0%

Velar 13.2% 12.3% 12.3% 25.7% 0% 3.6% 0% 0%

Total 15.6% 29.9% 14.9% 25.0% 0% 3.6% 0% 0%

Table B19. Means (Hz) and standard deviations (in brackets) for f0 and F1 before underlying voiced and voiceless stops at three places of articulation.

Means (SD)

Place Voiced Voiceless

f0 Bilabial 197 (82) 198 (85)

Coronal 200 (89) 204 (88)

Velar 201 (86) 203 (85)

200 (61) 202 (85)

F1 Bilabial 406 (74) 422 (70)

Coronal 444 (69) 460 (76)

Velar 393 (58) 419 (72)

414 (74) 434 (77)

APPENDIX C

RESULTS OF ACOUSTIC MEASUREMENTS IN EXPERIMENT 2

Table C1. Means (ms) and standard deviations (in brackets) of closure duration of underlying /d/ and /t/ in clusters in a word and a preposition.

Underlying Following Slow Fast

segment segment Clitic Word Clitic Word

/d/ Sonorant 60.1 (15) 55.2 (13) 47.9 (18) 40.9 (12)

Voiced 54.6 (13) 40.7 (9) 42.3 (12) 28.9 (7)

Voiceless 55.9 (14) 65.4 (10) 43.6 (14) 50.6 (10)

56.9 (15) 53.8 (20) 44.6 (14) 40.1 (14)

/t/ Sonorant 75.3 (16) 77.6 (14) 58.8 (14) 49.9 (15)

Voiced 55.2 (12) 41.9 (11) 42.6 (10) 30.3 (7)

Voiceless 62.2 (11) 65.6 (14) 48.8 (12) 45.6 (11)

64.4 (15) 61.7 (18) 50.2 (14) 41.9 (14)

Total 60.7 (12) 57.7 (11) 47.5 (10) 41.0 (7)

Note: Grand means are given in italics.

Table C2. Means (ms) and standard deviations (in brackets) of voicing duration of underlying /d/ and /t/ in clusters within a word and a preposition.

Underlying  Following    Clitic                  Word
segment     segment      Slow        Fast        Slow        Fast
/d/         Sonorant     58.1 (15)   47.1 (16)   52.0 (15)   40.5 (12)
/d/         Voiced       53.8 (12)   42.0 (12)   40.4 (9)    28.9 (7)
/d/         Voiceless    17.7 (11)   18.6 (9)    11.3 (5)    11.2 (8)
/d/         Grand mean   43.3 (20)   35.9 (14)   34.6 (16)   26.9 (12)
/t/         Sonorant     13.5 (8)    16.0 (8)    12.5 (6)    13.2 (8)
/t/         Voiced       52.0 (14)   40.6 (10)   41.6 (10)   30.3 (7)
/t/         Voiceless    16.0 (9)    17.6 (9)    11.3 (6)    10.6 (7)
/t/         Grand mean   26.8 (23)   24.4 (17)   21.8 (19)   18.0 (14)
Total                    35.0 (14)   30.1 (10)   28.2 (10)   22.5 (7)

Note: Grand means are given in italics.


Table C3. Mean VRs in underlying /t/ and /d/ within a word and a preposition.

Underlying  Following    Slow                    Fast
segment     segment      Clitic      Word        Clitic      Word
/d/         Sonorant     97.5%       94.3%       99.0%       99.2%
/d/         Voiced       98.7%       99.3%       99.5%       100%
/d/         Voiceless    34.0%       17.8%       45.0%       22.9%
/t/         Sonorant     19.2%       16.4%       28.6%       28.5%
/t/         Voiced       95.2%       99.3%       96.0%       100%
/t/         Voiceless    26.7%       17.0%       38.3%       25.2%

Table C4. Percent of fully voiced underlying /d/ and /t/ in clusters within a word and a preposition.

Underlying  Following    Slow                    Fast
segment     segment      Clitic      Word        Clitic      Word
/d/         Sonorant     92.9%       90.5%       94.5%       96.4%
/d/         Voiced       94.2%       96.4%       98.1%       100%
/d/         Voiceless    3.7%        0%          9.3%        0%
/t/         Sonorant     0%          0%          0%          3.6%
/t/         Voiced       90.6%       96.4%       90.6%       100%
/t/         Voiceless    1.9%        0%          3.5%        3.6%

Table C5. Means (ms) and standard deviations (in brackets) of duration of a vowel preceding underlying /d/ and /t/ in clusters within a word and a preposition.

Underlying  Following    Slow                    Fast
segment     segment      Clitic      Word        Clitic      Word
/d/         Sonorant     81.9 (14)   127.3 (21)  71.0 (14)   84.8 (11)
/d/         Voiced       78.7 (13)   88.2 (10)   66.9 (16)   72.4 (11)
/d/         Voiceless    77.0 (11)   69.6 (12)   66.0 (13)   52.7 (11)
/d/         Grand mean   79.2 (13)   95.0 (28)   67.9 (14)   69.9 (17)
/t/         Sonorant     66.0 (18)   117.9 (17)  58.0 (15)   74.7 (11)
/t/         Voiced       70.1 (17)   84.3 (12)   56.5 (13)   63.9 (14)
/t/         Voiceless    65.6 (17)   65.2 (11)   55.4 (13)   49.0 (12)
/t/         Grand mean   67.2 (18)   89.2 (26)   56.6 (14)   62.5 (16)
Total                    73.2 (16)   92.1 (27)   62.2 (15)   66.2 (17)

Note: Grand means are given in italics.


Table C6. Means (Hz) and standard deviations (in brackets) for f0 and F1 on a vowel preceding underlying /d/ and /t/ in prepositions in three types of clusters.

                            Means (SD)
Cue    Following segment    Underlying voiced    Underlying voiceless
f0     Sonorant             181 (60)             182 (62)
f0     VD                   183 (60)             183 (62)
f0     VL                   182 (59)             184 (61)
f0     Grand mean           182 (60)             183 (62)
F1     Sonorant             534 (83)             529 (84)
F1     VD                   536 (84)             520 (83)
F1     VL                   529 (95)             520 (86)
F1     Grand mean           533 (87)             523 (84)


APPENDIX D

RESULTS OF ACOUSTIC MEASUREMENTS IN EXPERIMENT 3

Table D1. Means (ms) and standard deviations (in brackets) of closure duration of underlying voiced and voiceless C1 stops in clusters with and without an intervening sonorant.

                         /d/                     /t/
Cluster     C2           Slow        Fast        Slow        Fast
C1-Son-C2   Voiced       57.8 (13)   54.9 (14)   64.7 (13)   50.3 (16)
C1-Son-C2   Voiceless    64.4 (14)   57.2 (14)   75.3 (17)   58.0 (15)
C1-Son-C2   Total        61.1 (14)   56.1 (14)   70.0 (15)   54.2 (16)
C1-C2       Voiced       55.4 (9)    43.1 (8)    54.7 (11)   42.5 (6)
C1-C2       Voiceless    55.5 (11)   44.2 (12)   62.3 (7)    48.4 (8)
C1-C2       Total        55.5 (10)   43.7 (10)   58.5 (10)   45.5 (7)

Note: Grand means are given in italics.

Table D2. Means (ms) and standard deviations (in brackets) of duration of voicing during closure of underlying /d/ and /t/ C1 stops in clusters with and without an intervening sonorant.

                         /d/                     /t/
Cluster     C2           Slow        Fast        Slow        Fast
C1-Son-C2   Voiced       50.9 (13)   46.4 (16)   17.2 (8)    17.0 (5)
C1-Son-C2   Voiceless    51.5 (12)   48.5 (14)   18.2 (7)    19.7 (8)
C1-C2       Voiced       54.4 (9)    42.8 (8)    51.4 (10)   40.3 (8)
C1-C2       Voiceless    18.0 (7)    19.4 (4)    16.0 (6)    17.2 (7)

Table D3. Voicing ratios of underlying /d/ and /t/ C1 stops in clusters with and without an intervening sonorant.

                         /d/                     /t/
Cluster     C2           Slow        Fast        Slow        Fast
C1-Son-C2   Voiced       91.0%       87.5%       28.4%       38.6%
C1-Son-C2   Voiceless    82.9%       86.9%       24.8%       37.2%
C1-Son-C2   Total        86.9%       87.2%       26.6%       37.9%
C1-C2       Voiced       98.7%       99.5%       95.2%       96.0%
C1-C2       Voiceless    34.0%       45.0%       26.7%       38.3%


Table D4. Percent of fully voiced underlying voiced and voiceless C1 stops in clusters with and without an intervening sonorant.

                         /d/                     /t/
Cluster     C2           Slow        Fast        Slow        Fast
C1-Son-C2   Voiced       81.6%       71.4%       5.5%        9.7%
C1-Son-C2   Voiceless    64.0%       67.4%       1.8%        7.8%
C1-Son-C2   Total        72.8%       69.4%       3.7%        8.8%
C1-C2       Voiced       94.2%       98.1%       90.6%       93.3%
C1-C2       Voiceless    3.7%        9.3%        1.9%        3.5%

Table D5. Means (ms) and standard deviations (in brackets) of duration of a vowel preceding underlying voiced and voiceless C1 stops in clusters with and without an intervening sonorant.

                         /d/                     /t/
Cluster     C2           Slow        Fast        Slow        Fast
C1-Son-C2   Voiced       62.6 (17)   53.7 (15)   78.1 (13)   72.2 (15)
C1-Son-C2   Voiceless    64.6 (21)   52.9 (15)   75.7 (17)   69.7 (21)
C1-Son-C2   Total        63.6 (19)   53.3 (14)   76.9 (18)   70.9 (19)
C1-C2       Voiced       70.8 (17)   56.8 (14)   78.1 (13)   66.8 (16)
C1-C2       Voiceless    65.7 (18)   55.4 (13)   77.2 (11)   66.1 (14)
C1-C2       Total        68.2 (17)   56.1 (13)   77.6 (15)   66.4 (15)

Table D6. Means (Hz) and standard deviations (in brackets) for f0 and F1 on a vowel preceding a C1 stop for underlying voiced and voiceless stops in obstruent-sonorant-obstruent clusters.

                                 Means (SD)
Cue    Cluster      C2 voice     /d/          /t/
f0     C1-Son-C2    Voiced       186 (58)     190 (62)
f0     C1-Son-C2    Voiceless    189 (61)     189 (61)
f0     C1-C2        Voiced       187 (59)     188 (61)
f0     C1-C2        Voiceless    186 (59)     190 (61)
f0     Total                     187 (59)     189 (61)
F1     C1-Son-C2    Voiced       507 (86)     512 (74)
F1     C1-Son-C2    Voiceless    509 (92)     514 (87)
F1     C1-C2        Voiced       544 (81)     526 (83)
F1     C1-C2        Voiceless    536 (95)     526 (87)
F1     Total                     524 (89)     519 (83)


APPENDIX E

RESULTS OF ACOUSTIC MEASUREMENTS IN EXPERIMENT 4

Table E1. Mean durations of stop closure (ms) and standard deviations (in brackets) for underlying /d/ and /t/ before /v/ followed by a vowel, a voiced stop, and a voiceless stop in the list, slow, and fast rate conditions.

Following   List                    Slow                    Fast
segment     /d/         /t/         /d/         /t/         /d/         /t/
Vowel       66.8 (11)   77.9 (12)   55.9 (12)   66.6 (14)   45.7 (12)   51.3 (13)
Voiced      55.3 (13)   61.5 (13)   47.7 (14)   53.9 (12)   44.6 (13)   39.1 (9)
Voiceless   52.9 (12)   57.9 (11)   46.7 (14)   48.3 (13)   38.9 (16)   38.1 (12)
Total       58.3 (13)   65.8 (15)   50.1 (14)   56.3 (15)   43.1 (14)   42.9 (13)

Table E2. Mean durations of voicing during closure (ms) and standard deviations (in brackets) for underlying /d/ and /t/ before /v/ followed by a vowel, a voiced stop, and a voiceless stop in the list, slow, and fast rate conditions.

Following   List                    Slow                    Fast
segment     /d/         /t/         /d/         /t/         /d/         /t/
Vowel       63.4 (11)   15.2 (9)    55.3 (13)   15.6 (9)    43.6 (15)   18.0 (11)
Voiced      51.6 (15)   25.9 (20)   42.4 (9)    27.3 (18)   44.0 (14)   29.9 (12)
Voiceless   18.5 (8)    15.1 (8)    19.2 (11)   14.2 (7)    19.3 (9)    18.3 (8)
Total       44.5 (22)   18.7 (14)   39.0 (19)   19.0 (13)   35.7 (17)   21.8 (12)

Table E3. Mean voicing ratios (VR) for underlying /d/ and /t/ before /v/ followed by a vowel, a voiced stop, and a voiceless stop in the list, slow, and fast rate conditions.

Following   List                    Slow                    Fast
segment     /d/         /t/         /d/         /t/         /d/         /t/
Vowel       95.9%       20.6%       98.8%       24.8%       94.8%       39.2%
Voiced      93.7%       44.8%       92.5%       55.5%       98.4%       79.1%
Voiceless   37.3%       26.4%       42.8%       32.2%       56.9%       51.5%


Table E4. Percent of fully voiced stops before /v/ followed by a vowel, a voiced stop, and a voiceless stop in the list, slow, and fast rate conditions.

Following   List                    Slow                    Fast
segment     /d/         /t/         /d/         /t/         /d/         /t/
Vowel       89.3%       0%          96.4%       0%          89.3%       10.7%
Voiced      85.7%       25.0%       82.1%       32.1%       96.3%       61.5%
Voiceless   0%          0%          3.6%        3.6%        25.9%       7.1%

Table E5. Mean durations (ms) and standard deviations (in brackets) of vowels preceding underlying /d/ and /t/ before /v/ followed by a vowel, a voiced stop, and a voiceless stop in the list, slow, and fast rate conditions.

Following   List                    Slow                    Fast
segment     /d/         /t/         /d/         /t/         /d/         /t/
Vowel       78.7 (14)   59.7 (15)   74.4 (9)    64.6 (11)   68.4 (11)   48.6 (10)
Voiced      69.9 (13)   63.6 (13)   75.4 (15)   62.7 (19)   66.0 (15)   47.9 (11)
Voiceless   65.6 (12)   58.0 (13)   70.3 (12)   55.6 (14)   63.8 (15)   43.9 (10)
Total       71.4 (14)   60.4 (14)   73.4 (12)   60.9 (15)   66.0 (14)   46.8 (10)