
    A Computational Phonology of Russian

by Peter A. Chew

    ISBN: 1-58112-178-4

    DISSERTATION.COM

    Parkland, FL USA 2003


    A Computational Phonology of Russian

    Copyright 2000 Peter A. Chew

    All rights reserved.

    Dissertation.com

    USA 2003

ISBN: 1-58112-178-4

www.Dissertation.com/library/1121784a.htm


    A Computational Phonology of Russian

    Peter Chew

Jesus College, University of Oxford

D. Phil. dissertation, Michaelmas 1999

    Abstract

This dissertation provides a coherent, synchronic, broad-coverage, generative phonology of Russian. I test the grammar empirically in a number of ways to determine its goodness of fit to Russian. In taking this approach, I aim to avoid making untested (or even incoherent) generalizations based on only a handful of examples. In most cases, the tests show that there are exceptions to the theory, but at least we know what the exceptions are, a baseline is set against which future theories can be measured, and in most cases the percentage of exceptional cases is reduced to below 5%.

The principal theoretical outcomes of the work are as follows. First, I show that all of the phonological or morphophonological processes reviewed can be described by a grammar no more powerful than context-free.

Secondly, I exploit probabilistic constraints in the syllable structure grammar to explain why constraints on word-marginal onsets and codas are weaker than on word-internal onsets and codas. I argue that features such as [±initial] and [±final], and extraprosodicity, are unnecessary for this purpose.

Third, I claim that /v/ should be lexically unspecified for the feature [±sonorant], and that the syllable structure grammar should fill in the relevant specification based on its distribution. This allows a neat explanation of the voicing assimilation properties of /v/, driven by phonotactics.

Fourth, I argue that jers in Russian should be regarded as morphological objects, not segments in the phonological inventory. Testing the grammar suggests that while epenthesis cannot be regarded as a major factor in explaining vowel-zero alternations, it might be used to explain a significant minority of cases.

Fifth, I suggest that stress assignment in Russian is essentially context-free, resulting from the intersection of morphological and syllable structure constraints. I show that my account of stress assignment is simpler than, but just as general as, the best of the three existing theories tested.

Finally, this dissertation provides new insight into the nature and structure of the Russian morphological lexicon. An appendix of 1,094 morphemes and 1,509 allomorphs is provided, with accentual and jer-related morphological information systematically included.


    _______________________________

    A Computational Phonology of Russian

    by

    Peter Chew

University of Oxford

Jesus College

    Michaelmas 1999

    _______________________________

Thesis submitted for the degree of Doctor of Philosophy at the University of Oxford


    Acknowledgements

I would like to thank my supervisor, John Coleman, for his help. Without his encouragement and support even before I embarked upon this research, I would doubtless now be a well-paid but bored chartered accountant. Auditing linguistic theories has proved to be more rewarding in many ways than auditing financial statements, and I am confident that the choice of leaving my previous job to pursue this research was the right one.

It would not have been possible to complete this D. Phil. without the support of my wife, Lynn. She has always been there to give practical suggestions, as a sounding board for ideas, and simply as a partner in life, sharing encouraging and discouraging times together. God could not have given me a better wife.

My parents have also been a great practical help, babysitting almost weekly, having us round for meals, and generally helping reduce the stress in our lives. Although Jonathan, who was born 15 months before I submitted this thesis, has taken time from my studies, we are very grateful for his arrival. I cannot think of a better way of spending my time, and I cannot imagine a better son.

A number of people have read drafts of my work or listened to me, giving helpful advice which enabled me to sharpen my thoughts and improve the way in which I expressed them. Thanks (in alphabetical order) to Dunstan Brown, Bob Carpenter, Andrew Hippisley, Mary MacRobert, Stephen Parkinson, Burton Rosner, Irina Sekerina, Andrew Slater, and Ian Watson. Andrew Slater has also provided invaluable technical support. I often feel that he puts the rest of us to shame with his good humour, helpfulness, and a constant willingness to go the extra mile.

My friends at the Cherwell Vineyard Christian Fellowship have provided a dependable support network which has kept Lynn and me going through not always easy times. First and foremost, they have encouraged us to keep looking towards the one without whom we can do nothing. However, I know I will also look back on the laughs we have had with Richard and Janet Remmington, Evan and Eowyn Robertson, Judy Irving, and others, on Thursday evenings with fond memories.

Finally, I would like to thank my college, Jesus College, for providing very generous financial support throughout my time at Oxford. And without the financial support of the Arts and Humanities Research Board (formerly the British Academy), I would not have undertaken this research project in the first place.


    List of abbreviations and symbols

    General symbols

/ /  enclose phonemic representations

[ ]  enclose phonetic representations

/…c + …rn + …in/  denotes morphological tokenization; subscripts classify individual morphs

+  morpheme boundary

.  syllable boundary

ˈ  denotes word-stress in IPA transcriptions (stress on the vowel to the right)

/…/rn  denotes a single morpheme (classificatory subscript is outside the obliques)

σ  syllable

ε  the empty string

anter  anterior
C  any consonant
CFG  context-free grammar
cons  consonantal
cont  continuant
coron  coronal
DCG  (Prolog) Definite Clause Grammar
del_rel  delayed release
init  initial
later  lateral
OT  Optimality Theory
PSG  phrase structure grammar
sonor  sonorant
SSG  Sonority Sequencing Generalization
V  any vowel
vfv  vocal fold vibration
voc  vocalic

    Symbols used in morphological tokenization

r*  root
s  suffix
c*  clitic
i  inflectional ending
p  pronominal
a  adjectival
n  substantival
v  verbal
d  durative process
r*  resultative process
i  iterative process
c*  completed process

*No ambiguity arises with respect to the use of non-unique symbols, because the meaning of each symbol is also dependent on its position; full details are given in section 3.2.1.2.


    Table of contents

Acknowledgements
List of abbreviations and symbols
Table of contents
Table of figures
List of tables

Chapter 1: Introduction
  1.1 Introduction
  1.2 Why computational linguistics?
  1.3 The framework
    1.3.1 Phrase-structure grammar
    1.3.2 Context-free grammar
  1.4 The methodology
  1.5 The dataset used for the tests
  1.6 Summary

Chapter 2: Syllable structure
  2.1 Overview and aims
  2.2 The syllable in phonological theory
    2.2.1 Sonority and syllable structure
    2.2.2 Morpheme structure constraints or syllable structure constraints?
    2.2.3 Syllable structure assignment
      2.2.3.1 Kahn's (1976) syllable structure assignment rules
      2.2.3.2 Itô's (1986) method of syllable structure assignment
      2.2.3.3 Syllable structure assignment in Optimality Theory
      2.2.3.4 Phrase-structure analysis of syllable structure
      2.2.3.5 Syllable structure assignment: conclusions
  2.3 A linear grammar of Russian syllable structure
    2.3.1 The phonological inventory of Russian
      2.3.1.1 Preliminaries: controversial issues
      2.3.1.2 The classification system
    2.3.2 The syllable structure rules
  2.4 A heuristic for deciding between multiple syllabifications
  2.5 Extensions to the grammar
    2.5.1 Further phonological features
    2.5.2 Four phonological processes in Russian
      2.5.2.1 Consonant-vowel interdependencies
      2.5.2.2 Reduction of unstressed vowels
      2.5.2.3 Word-final devoicing
      2.5.2.4 Voicing assimilation
    2.5.3 A test of the extensions to the grammar
  2.6 Summary

Chapter 3: Morphological structure
  3.1 Introduction and aims
    3.1.1 Generative approaches to word-formation
    3.1.2 Morphology and context-free grammar
  3.2 A linear grammar of Russian word-formation
    3.2.1 The morphological inventory of Russian
      3.2.1.1 Preliminaries: controversial issues
      3.2.1.2 The classification system
    3.2.2 The word-formation rules
      3.2.2.1 Words with no internal structure
      3.2.2.2 Nouns
      3.2.2.3 Verbs
      3.2.2.4 Prefixation
  3.3 Vowel-zero alternations in context-free grammar
  3.4 A heuristic for deciding between multiple morphological analyses
    3.4.1 Assigning costs to competing analyses
    3.4.2 Should the cost mechanism be based on hapax legomena?


  3.5 Tests of the word-formation grammar
    3.5.1 Test of coverage of the word-formation grammar
    3.5.2 Test of the grammar's treatment of vowel-zero alternations
  3.6 Conclusion

Chapter 4: Stress assignment: three existing theories
  4.1 Introduction
    4.1.1 Two approaches to stress in Russian: the Slavist and the generative approaches
    4.1.2 Aims of this chapter
  4.2 Three theories of stress assignment
    4.2.1 Halle (1997)
    4.2.2 Melvold (1989)
    4.2.3 Zaliznjak (1985)
  4.3 Derivational theories and underdeterminacy
    4.3.1 Computing underlying accentuations by brute force
    4.3.2 Backwards phonology and the Accent Learning Algorithm
      4.3.2.1 A concise encoding of solutions
      4.3.2.2 Formalization of the Accent Learning Algorithm
      4.3.2.3 A small-scale demonstration of the ALA on a non-problem combination
      4.3.2.4 Problem words
      4.3.2.5 Modifications to the ALA to allow for different theories
      4.3.2.6 Conclusions from the ALA
    4.3.3 Unique specification of the morpheme inventory by defaults
  4.4 Tests to ascertain the coverage of the three theories
    4.4.1 Test of Halle's theory on non-derived nouns
    4.4.2 Test of Halle's theory on non-derived and derived nouns
    4.4.3 Test of Melvold's theory on non-derived and derived nouns
    4.4.4 Test of Melvold's theory on nouns, non-reflexive verbs and adjectives
    4.4.5 Test of Zaliznjak's theory on nominative singular derived nouns
    4.4.6 Test of Melvold's theory on nominative singular derived nouns
    4.4.7 Analysis of errors in Melvold's and Zaliznjak's theories
  4.5 Summary

Chapter 5: Stress assignment: a new analysis
  5.1 Introduction
  5.2 Context-free phonology and stress in Russian
    5.2.1 Encoding which morpheme determines stress
    5.2.2 Polysyllabic morphemes
    5.2.3 Post-accentuation
    5.2.4 Jer stress retraction
    5.2.5 Plural stress retraction
    5.2.6 Dominant unaccented morphemes
    5.2.7 Concluding comments about the context-free phonology
  5.3 A test of the entire grammar
  5.4 Conclusions

Appendix 1: Russian syllable structure grammar
Appendix 2: Russian word-formation grammar
Appendix 4: Morphological inventory
Appendix 5: The computational phonology as a Prolog Definite Clause Grammar
References


    Table of figures

Figure 1. The Chomsky Hierarchy
Figure 2. Classification of analyses of an imperfect grammar
Figure 3. Tree-structure for !+%+%!
Figure 4. Lattice showing the hierarchy of Russian phoneme classes
Figure 5. The Russian vowel system
Figure 6. The Russian vowel system in unstressed positions
Figure 7. Partial syllabic structure of pretonic !%! after a [−back] consonant
Figure 8. Tree-structure for !"#$%&!0'1$'2%!(0'1&'.23)
Figure 9. Parse tree for '"#()*)+,-.$ !4'1$"',5,+6#&'!
Figure 10. Examples of subtrees from Figure 9
Figure 11. Morphological tokenization of '"#()*)+,-.$ !4'1$"',5,+6#&'!
Figure 12. Parse tree for '"#()*)+/0 !4'1$"',5,+,7!
Figure 13. Oliverius's (1976) tokenization of *"'1)'& !5148&8',4%! 'woman'
Figure 14. Parse tree for *"'1)'& !5148&8',4%! 'woman'
Figure 15. Three alternative representations of !906c+&8',&rv+%svi+&'sa!
Figure 16. Representation of the morpheme !-%#2!%!-%#62! 'weasel'
Figure 17. Structure of #"'23"%
Figure 18. Structure of 4,#,*56&
Figure 19. Structure of 4,#*,5
Figure 20. Parse tree for -4"7)&6$',-.$ (with log probabilities)
Figure 21. Rank-frequency graph
Figure 22. Analysis of coverage of morphology grammar
Figure 23. Parse tree for -4"7)&6$',-.$
Figure 24. Morphological/phonological structure of #8!&%&: !$;0ra+%2sn+%in3!
Figure 25. The constraint pool
Figure 26. Morphological/phonological structure of #8!&:%! $;0ra+%2sn+in1!
Figure 27. Morphological/phonological structure of (/-,:./!",#ra+/6&sn1+,in!
Figure 28. Morphological/phonological structure of 8:!,("'$ //;c+06"'ra+14'sn+in!


    List of tables

Table 1. Types of rules permitted by grammars in the Chomsky Hierarchy
Table 2. Analysis of words in on-line corpus
Table 3. Russian morpheme structure constraints on consonant clusters
Table 4. Reanalysis of morpheme-medial clusters using syllable structure
Table 5. Phonological inventories of different scholars
Table 6. The phonemic inventory of Russian
Table 7. Classification of Russian phonemic inventory
Table 8. Distribution of word-initial onsets by type
Table 9. Distribution of word-final codas by type
Table 10. Further coda rules
Table 11. Exhaustive list of initial clusters not accounted for
Table 12. Exhaustive list of final clusters not accounted for
Table 13. The twelve most frequently applying onset, nucleus and coda rules
Table 14. Feature matrix to show classification of Russian phonemes and allophones with respect to all features
Table 15. Allophonic relationships in consonant-vowel sequences
Table 16. Allophones of !%! and !6!
Table 17. Results of phoneme-to-allophone transcription test
Table 18. Classification system for substantival inflectional morphs
Table 19. Further categories of morphological tokenization
Table 20. Summary of results of parsing 11,290 words
Table 21. Derivations of six Russian words in accordance with Halle (1997)
Table 22. Derivations of five Russian words in accordance with Melvold (1989)
Table 23. Possible solutions for -.,6 !#&/6-! 'table' (nom. sg.)
Table 24. Possible solutions for -.,6&:
Table 29. Demonstration that Melvold's theory is problematic
Table 30. Demonstration that Zaliznjak's theory is problematic
Table 31. Number of candidate accentuations against
Table 32. Ranking of underlying morpheme forms
Table 33. Results of testing Halle's theory on non-derived words
Table 34. Results of testing Halle's theory on non-derived and derived nouns
Table 35. Results of testing Melvold's theory on non-derived and derived nouns
Table 36. Results of testing Melvold's theory
Table 37. Results of testing Zaliznjak's theory
Table 38. Results of testing Melvold's theory
Table 39. Analysis of words incorrectly stressed by Melvold's theory
Table 40. Analysis of words incorrectly stressed by Zaliznjak's theory
Table 41. Exceptions common to Zaliznjak's and Melvold's theories
Table 42. Prefixed nouns stressed incorrectly by Zaliznjak
Table 43. Words derived from prefixed stems
Table 44. Further words derived from prefixed stems
Table 45. Results of testing the overall phonology for its ability to assign stress
Table 46. Results of testing Melvold's theory on 4,416 nouns


    Chapter 1: Introduction

    1.1 Introduction

    This dissertation provides a coherent, synchronic, broad-coverage, generative

    account of Russian phonology. By broad-coverage, I mean that it will cover a

    number of phonological phenomena (stress assignment, syllabification, vowel-zero

    alternations, word-final devoicing, voicing assimilation, vowel reduction, and

    consonant-vowel interdependencies) within a single constrained grammar. While I

    have not attempted to deal exhaustively with all the phonological problems of interest

    in Russian (for example, I do not attempt to account for all morphophonological

    alternations), the current work covers those areas which have attracted the most

    attention in the literature on Russian phonology.

    While all these aspects of Russian phonology have been richly documented,

    generally they have been dealt with in isolation; the one notable exception to this is

Halle's (1959) Sound Pattern of Russian. The following quotation (op. cit., p. 44)

serves to show that Halle's account of Russian phonology is also intended to be

    broad-coverage in the sense just outlined:

When a phonological analysis is presented, the question always arises as to what extent the proposed analysis covers the pertinent data. It is clearly impossible in a description to account for all phonological manifestations in the speech of even a single speaker, since the latter may (and commonly does) use features that are characteristic of different dialects and even foreign languages. (E.g., a speaker of Russian may distinguish between nasalized and nonnasalized vowels in certain [French] phrases which form an integral part of his habitual conversational repertoire.) If such facts were to be included, all hopes for a systematic description would have to be abandoned. It is, therefore, better to regard such instances as deviations to be treated in a separate section and to restrict the main body of the grammar to those manifestations which can be systematically described.


    The aim of the current work is thus substantially the same as that of Halle

    (1959). However, in the forty years since then there have been a number of advances,

    both linguistic and technological, which allow us to take a fresh (and perhaps more

    rigorous) look at some of the same phenomena which Halle and others attempted to

    describe. In the late 1950s and early 1960s Chomsky and co-workers pioneered work

    in developing a formal theory of language (Chomsky 1959, 1963, 1965); this work

    established clearly-defined links between linguistics, logic and mathematics, and was

    also foundational in computer science in the sense that the principles it established

    have also been applied in understanding computer programming languages. These

    advances make it possible to formulate a theory of Russian phonology, just as Halle

    did, but to test it empirically by implementing the theory as a computer program and

    using it to process very large numbers of words. Moreover, since the technological

advances which make it possible to do this owe a great deal to Chomsky's work, the

    transition from generative grammar to computational grammar can be a comparatively

    straightforward one.

    One of the defining features of generative grammar is the emphasis on

    searching for cross-linguistic patterns. Without denying the value of language-specific

    grammar, Chomsky and Halle (1968) (to many the canonical work of generative

    phonology) illustrates this thinking:

...we are not, in this work, concerned exclusively or even primarily with the facts of English as such. We are interested in these facts for the light they shed on linguistic theory (on what, in an earlier period, would have been called universal grammar) and for what they suggest about the nature of mental processes in general… We intend no value judgment here; we are not asserting that one should be primarily concerned with universal grammar and take an interest in the particular grammar of English only insofar as it provides insight into universal grammar and psychological theory. We merely want to make it clear that this is our point of departure in the present work; these are the considerations that have determined our choice of topics and the relative importance given to various phenomena. (p. viii)


The emphasis on cross-linguistic generalization, characteristic of Chomsky's

    work, has characterized generative linguistics ever since: indeed, there is a

    considerable branch of linguistics (Zwicky 1992 is an example) which abstracts

    completely away from language-specific data. (This branch deals in what Zwicky

    1992: 328 refers to as frameworks as opposed to theories.) While frameworks have

    their place (indeed, a theory cannot exist without a framework), the difficulty is

    always that frameworks cannot be verified without theories. In this light, Chomsky

    and Halle (1968) claimed to establish both a cross-linguistic framework and a theory

    about English phonology.

    The focus of this description is on ensuring that the phonology of Russian

proposed is both internally consistent and descriptively adequate – that is, that it

makes empirically correct predictions about Russian – rather than on attempting to

    develop any particular linguistic framework. Exciting possibilities are open in this line

    of research thanks to the existence of computer technology. It is possible to state

    grammatical rules in a form which has the rigour required of a computer program, and

    once a program is in place, large corpora can be quickly processed. Thus the

    phonology of Russian presented here is computational simply because of the

    advantages in speed and coverage that this approach presents.

    Establishing that a linguistic theory can be implemented as a computer

    program and verifying its internal consistency in this way is a valuable exercise in

    itself, but non-computational linguists may be sceptical: some may argue that this

    kind of approach does not contribute anything to linguistics per se. Whether or not

this criticism is well-founded (and I believe it is not), I hope that this dissertation


    will satisfy even the more stringent critics by making a number of key contributions to

    linguistic knowledge. These are as follows.

First, I propose that both the distribution of /v/ and its behaviour with respect to voicing assimilation can be explained if /v/, unlike all other segments in the phonological inventory of Russian, is lexically unspecified for the feature [±sonorant]. The syllable structure rules determine whether /v/ is [+sonorant] or [−sonorant], and this in turn determines how /v/ assimilates in voice to adjacent segments.

    Second, I suggest that the greater latitude allowed in word-marginal onsets and

    codas, which is a feature of Russian and other languages (cf. Rubach and Booij 1990),

    can be explained naturally by a probabilistic syllable structure grammar. This

approach allows features such as [±initial] and [±final] (cf. Dirksen 1993) to be

    dispensed with.

    Third, I show that vowel-zero alternations in Russian cannot fully be

    explained by a Lexical-Phonology-style account (such as that proposed by Pesetsky

    ms 1979) alone, nor can they be the result of epenthesis alone. I show empirically that

    a combination of factors, including (1) the morphophonological principles discovered

    by Pesetsky, (2) epenthesis, and (3) etymology, governs vowel-zero alternations.

    Fourth, I show that Russian stress can be accounted for with a high rate of

    accuracy by existing generative theories such as that of Melvold (1989), but I suggest

    a simpler theory which accounts for the same data with as good a rate of accuracy.

    The theory which I propose regards stress assignment as resulting from the interaction

    of morphological and syllable structure: existing generative theories do not

    acknowledge syllable structure as playing any role in Russian stress assignment. An

    integral part of my theory is a comprehensive inventory of morphemes together with


    the accentual information which is lexically specified for each morpheme. The

    inventory which I propose, which is arrived at by computational inference, includes

    1,094 morphemes and 1,509 allomorphs, while the longest existing list of this type, as

    far as I am aware, is the index of approximately 250 suffixes in Redkin (1971).

    The structure of this dissertation is as follows. In this chapter, I set out in

    detail the concepts which are foundational to the whole work: the role which

    computation plays in my work (1.2), the framework which I use (1.3), and the

    methodology which underlies my work (1.4). Then, I discuss in detail aspects of the

    syllable structure and morphological structure of Russian in Chapters 2 and 3

    respectively, in each case developing a formally explicit grammar module which can

    be shown to be equivalent to a finite state grammar. Chapter 4 describes in detail three

    theories of stress assignment in Russian. These are tested computationally to ascertain

    which is the most promising. Each of Chapters 2-4 begins with a section reviewing

    the relevant literature. Finally, in Chapter 5, I describe how the principal features of

    the preferred theory from Chapter 4 can be incorporated into a synthesis of the

    grammars developed in Chapters 2 and 3. The result is an integrated, internally

    consistent, empirically well-grounded grammar, which accounts for a variety of

    different aspects of Russian phonology.

    1.2 Why computational linguistics?

    In this dissertation, computation is used as a tool. Any tool has limitations, of

    course: a large building cannot be built with a power drill alone, and, to be sure, there

    are problems in linguistics which computation is ill-suited to solve. On the other hand,

    anyone who has a power drill will try to find appropriate uses for it. Likewise, I aim

    to use computation for the purposes for which it is best suited. This, then, is not a


    dissertation about computational linguistics; it is a dissertation that uses computation

    as a tool in linguistics.

    What, then, are the strengths of computational tools in linguistics? Shieber

    (1985: 190-193), noting that the usefulness of computers is often taken for granted by

    computational linguists, lists three roles that the computer can play in the evaluation

    of linguistic analyses: the roles of straitjacket (forcing rigorous consistency and

    explicitness, and clearly delineating the envelope of a theory), touchstone (indicating

    the correctness and completeness of an analysis), and mirror (objectively reflecting

    everything in its purview). In short, the process of implementing a grammar

    computationally forces one to understand in detail the mechanisms by which a

    grammar assigns structure. Shieber states, for example, that

we have found that among those who have actually attempted to write a computer-interpretable grammar, the experience has been invaluable in revealing real errors that had not been anticipated by the Gedanken-processing typically used by linguists to evaluate their grammars – errors usually due to unforeseen interactions of various rules or principles. (p. 192)

    This has also been my experience in developing the current phonology of

    Russian. In particular, areas such as stress assignment involve the interaction of a

    number of different grammar modules, and, as Shieber states, decisions in one part of

    the grammar, while internally consistent, may not cohere with interacting decisions in

    another part (Shieber 1985: 190). Problems of this kind cannot always feasibly be

    foreseen without actually implementing and testing a theory on a corpus of data.

    Another perhaps self-evident strength of computers is their ability to process

    large volumes of data quickly: once a grammar has been implemented, the processing

    can take place without intensive effort on the part of the researcher. While in principle


    generative theories can be implemented and tested by hand, the volume of data that

    typically has to be processed to achieve significant results means that this is an

    extremely tedious and time-consuming, if not impracticable, task. Clearly,

    computational techniques shift the burden for the researcher from data processing to

    the more interesting task of developing theories, identifying exceptions quickly, and

    debugging the theory as appropriate.
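As a purely illustrative sketch of what such batch testing looks like, the following Prolog fragment partitions a test lexicon into the words a grammar accepts and the exceptions it rejects; the names phonological_word//0 and word/1 are hypothetical placeholders rather than predicates of the grammar developed in later chapters.

    % Minimal sketch of batch-testing a grammar module over a word list.
    % phonological_word//0 stands for whichever grammar module is under test,
    % and word/1 for the test lexicon; both names are hypothetical placeholders.

    parse_word(Word) :-
        phrase(phonological_word, Word).     % succeeds iff the grammar accepts Word

    test_corpus(Parsed, Exceptions) :-
        findall(W, word(W), Words),
        split_words(Words, Parsed, Exceptions).

    split_words([], [], []).
    split_words([W|Ws], [W|Ps], Es) :-
        parse_word(W), !,
        split_words(Ws, Ps, Es).
    split_words([W|Ws], Ps, [W|Es]) :-
        split_words(Ws, Ps, Es).

Once a run of this kind completes, the Exceptions list is exactly the set of forms that the current theory fails to cover, which is what makes the debugging cycle described above practicable.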

    Because the discipline of computational linguistics is still relatively young, it

    is perhaps understandable that many existing theories have neither been implemented

    nor tested computationally, but now that the means to validate theories are widely

    available, it is less justifiable for new theories still to be proposed in linguistics

without being empirically tested: the widespread practice of testing a few interesting

    cases is unreliable and is no substitute for an exhaustive check (Bird 1995: 14). It

    seems that at this stage in linguistic research, the efforts of linguists would be better

    directed towards implementing and testing existing theories rather than proposing new

    alternatives, since otherwise it cannot be demonstrated that the new alternatives

    measure up any better to the criteria of coverage, constrainedness and ability to

    integrate than the theories which they replace.

    It is also worth noting the limitations of computational analysis (which I set as

    the limits for this dissertation). Ultimately, computers follow instructions rather than

    making judgements, and while they are very good at evaluating grammars for

    consistency and descriptive adequacy, they cannot test for explanatory adequacy

    unless the programmer supplies the necessary information (that is, a standard against

    which to measure the accuracy of structures assigned by a grammar to strings). The

    judgement about the nature of the correct structures is a question of psychology, and


    therefore I do not claim that the current phrase-structure context-free phonology of

    Russian is a psychological model. In this, my approach is exactly the same as that of

    Gazdar, Klein, Pullum and Sag (1985):

We make no claims, naturally enough, that our grammatical theory is eo ipso a psychological theory. Our grammar of English is not a theory of how speakers think up things to say and put them into words. Our general linguistic theory is not a theory of how a child abstracts from the surrounding hubbub of linguistic and nonlinguistic noises enough evidence to gain a mental grasp of the structure of a natural language. Nor is it a biological theory of the structure of an as-yet-unidentified mental organ. It is irresponsible to claim otherwise for theories of this general sort…

Thus we feel it is possible, and arguably proper, for a linguist (qua linguist) to ignore matters of psychology. But it is hardly possible for a psycholinguist to ignore language… If linguistics is truly a branch of psychology (or even biology), as is often unilaterally asserted by linguists, it is so far the branch with the greatest pretensions and the fewest reliable results… So far, linguistics has not fulfilled its own side of the interdisciplinary bargain. (p. 5)

    1.3 The framework

    1.3.1 Phrase-structure grammar

    In this dissertation, phonology and morphology, as modules of grammar, have

    the function of enumerating or generating (the words of a) language. This view of

    grammatical modules is entirely in accordance with traditional generative linguistics

    (e.g. Chomsky and Miller 1963: 283-285). More precisely, a phonological grammar

    should be able to generate all and only the phonological words of a natural language;

    similarly, a word-formation grammar should enumerate all the morphological words

    (p-forms, in the terminology of Zwicky 1992: 334) of a natural language.1 The same

1 As noted by Booij and Rubach (1984), there may well not be a one-to-one mapping between morphological words and phonological words – well-known examples from Russian are preposition-noun phrases, all of which have a single stress (e.g. 9&:!8%8!*/%


    grammar that enumerates the forms of a language should also be able to assign them a

    structural description (that is, parse them). These functions are clearly fulfilled by

    phrase-structure grammars (PSGs), since in a PSG each rule can equivalently be

    thought of as a partial structure, and each derivation can be represented as a directed

    graph.

The ability of a grammar to parse (that is, provide some structural description

    for the word) does not necessarily imply its ability to parse correctly. As Chomsky

    and Miller (1963: 297) state, we have no interest, ultimately, in grammars that

    generate a natural language correctly but fail to generate the correct set of structural

    descriptions. A grammar which is able to assign structural descriptions to all relevant

well-formed utterances in a language is said to meet the condition of descriptive adequacy, while a grammar which meets the more stringent requirement of assigning correct structural descriptions to all well-formed utterances is said to meet the condition of explanatory adequacy. In general, it is considerably harder to prove or disprove a grammar's explanatory adequacy than its descriptive adequacy, since the

    former is a matter not just of linguistic data but of psychology as well (Chomsky

    1965: 18-27). Moreover, it is important to realize that a parse should not necessarily

    be considered incorrect just because it was unanticipated: such a parse may in fact be

    a possible but unlikely parse. These factors all mean that establishing whether a given

    grammar assigns correct structural descriptions is not always straightforward, and is

    often a matter of judgement.

Conversely, English words of the form non-X (where X stands for an adjective) are a single morphological word, but two phonological words.


    Essentially, there are three good reasons for formulating a theory within the

    framework of PSG. First, PSGs are the standard means of assigning hierarchical

    constituent structure to strings, which is widely and uncontroversially regarded as an

    important function of linguistics. The literature on phrase-structure grammar has been

developed over approximately 40 years, and owes much to Chomsky's interest in

    establishing a formal foundation for linguistics (e.g. Chomsky 1959, Chomsky and

    Miller 1963, Chomsky 1963).

    A second strength of the PSG formalism is that it has a straightforward

    declarative interpretation. Phrase-structure grammar rules can equally validly be

    seen as partial descriptions of surface representations or descriptions of information

structures, in Brown, Corbett, Fraser, Hippisley and Timberlake's (1996)

    terminology. Specifically, context-free phrase-structure rules can be represented

    graphically as tree structures (Coleman 1998: 99).

    Third, there is a transparent relationship between PSGs and Definite Clause

    Grammars (DCGs)2. This is perhaps the greatest advantage of using the PSG

    formalism, because it means that a PSG can easily be implemented and tested

    computationally. DCGs are a particular type of formalism available as part of the

    programming language Prolog. For details of the workings of Prolog and DCGs, the

    reader is invited to refer to a textbook on Prolog, such as Clocksin and Mellish

    (1981). Here, it is sufficient to appreciate that DCGs can fulfil the functions of parsing

    and generation, because Prolog is a declarative programming language. Thus, if a

2 DCGs are capable of defining recursively enumerable languages and CFGs are capable of defining only context-free languages (which are a subset of the set of recursively enumerable languages). Thus, to be more precise, the type of DCG used to implement the theory proposed in this dissertation is a restricted type of DCG.


    particular grammar is implemented as a DCG, it is possible to test the grammar

    computationally to determine whether it describe[s] all and only the possible forms

    of a language (Bird, Coleman, Pierrehumbert and Scobbie 1992). Throughout this

    dissertation, I describe computational tests of this kind to determine whether the

    different aspects of the grammar are accurate representations of the facts of the

    language.
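To illustrate the relationship, here is a minimal invented DCG fragment (a sketch only; it is not the grammar of Appendix 5, and the rules and segments are made up for the example). Each context-free rule corresponds directly to a DCG clause, and an extra argument threaded through the rules returns the structural description:

    % Toy fragment of a phrase-structure grammar written as a DCG
    % (invented rules and segments; not the grammar of Appendix 5).
    % The argument threaded through each rule returns the structural description.

    syllable(syll(O, N)) --> onset(O), nucleus(N).

    onset(onset(t))    --> [t].
    onset(onset(s, t)) --> [s], [t].
    nucleus(nuc(a))    --> [a].

    % Parsing a string and recovering its structure:
    % ?- phrase(syllable(Tree), [s, t, a]).
    % Tree = syll(onset(s, t), nuc(a)).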

    1.3.2 Context-free grammar

    Having established in section 1.3.1 why I use the framework of PSG, I now

    move on to explain the significance of my claim that nothing more powerful than a

    context-free grammar (CFG) is necessary to describe the facts of Russian phonology.

    The claim that CFG is sufficient is in contrast to McCarthy (1982), for example, who

    claims that phonology is context-sensitive (p. 201). (Coleman 1998: 81 observes that

    his phonology is an unrestricted rewriting system, since it is a context-sensitive

grammar with deletion: see McCarthy's (1) on p. 201.) In other respects, however,

McCarthy's aim is comparable to mine: McCarthy aims to provide a fair degree of

    coverage, particularly in Hebrew phonology and Arabic morphology (p. 2), including

    stress assignment.

    CFGs are one of a number of types of grammar formalism in the Chomsky

    Hierarchy (Chomsky 1959), represented in Figure 1. All of these grammar formalisms

    are members of the family of PSGs. The place of a particular grammar within the

    hierarchy is determined by the type of rules included in the grammar, as shown in

    Table 1 (adapted from Coleman 1998: 79).


Figure 1. The Chomsky Hierarchy

[Figure 1 depicts the hierarchy as nested sets: linear grammars (type 3) within context-free grammars (type 2), within context-sensitive grammars (type 1), within unrestricted grammars (type 0).]

Table 1. Types of rules permitted by grammars in the Chomsky Hierarchy3

Type | Grammar           | Rule types allowed | Conditions on symbols
0    | Unrestricted      | A → B              | A ∈ (VT ∪ VN)*, B ∈ (VT ∪ VN)*
1    | Context-sensitive | A → B / C _ D      | A ∈ VN, B ∈ (VT ∪ VN)+, C, D ∈ (VT ∪ VN)*
2    | Context-free      | A → B              | A ∈ VN, B ∈ (VT ∪ VN)*
3    | Right linear      | A → aB             | A ∈ VN, B ∈ (VN ∪ {ε}), a ∈ VT
3    | Left linear       | A → Ba             | A ∈ VN, B ∈ (VN ∪ {ε}), a ∈ VT

3 Note to Table 1: Following Chomsky (1959) and Coleman (1998), VT represents the set of terminal symbols, VN the set of non-terminal symbols, X* a sequence of zero or more Xs, X+ a sequence of one or more Xs, and ε the empty string.
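To make the rule-type distinction of Table 1 concrete, the following sketch describes the same two strings first with a context-free (type 2) rule, which may expand to several non-terminals at once, and then with right-linear (type 3) rules, which emit one terminal at a time; the symbols are invented for illustration only and are not rules of the grammar developed in later chapters:

    % The same toy string set described with type 2 and with type 3 rules
    % (invented symbols; illustrative only).

    % Type 2 (context-free): several non-terminals may appear on the right-hand side.
    syllable --> onset, nucleus, coda.
    onset    --> [t].
    onset    --> [s], [t].
    nucleus  --> [a].
    coda     --> [m].

    % Type 3 (right linear): each rule rewrites to one terminal followed by
    % at most one non-terminal, so the string is described strictly left to right.
    syll    --> [t], rest.
    syll    --> [s], after_s.
    after_s --> [t], rest.
    rest    --> [a], final.
    final   --> [m].

    % Both fragments accept exactly [t,a,m] and [s,t,a,m]:
    % ?- phrase(syllable, [s,t,a,m]).    ?- phrase(syll, [s,t,a,m]).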


    Because there has been a considerable amount of work carried out in phrase-structure

    grammar, the properties of different types of PSG in the Chomsky Hierarchy are by

    now well understood. These properties are important to consider when formulating a

    theory, for reasons which will now be made clear.

    On a very general level, the more restricted the grammar formalism, the better.

This follows, essentially, from the principle of Occam's razor: as Coleman (1998: 80)

    points out, the goal in developing a formal theory of natural-language syntax or

    phonology, is to use a type of grammar which is as powerful as necessary, but as

    restrictive as possible. It should be acknowledged, however, that context-free

    grammars can in practice have a cost compared to more powerful types of grammars,

    in that more powerful grammars may describe the same phenomena more simply

    (with fewer features or more general rules, for example), and may even be able to

    parse and generate more efficiently in some cases (Weinberg 1988).

    However, there are other, perhaps more psychological, arguments in support

    of choosing a grammar formalism no more powerful than context-free. Bresnan and

    Kaplan (1982) set out a number of constraints that they suggest linguistic theory

    should impose on the class of possible grammars, and CFGs adhere to all but one of

    these constraints. The one constraint which CFG does not adhere to is the

    universality constraint, which assumes that the procedure for grammatical

    interpretation, mG, is the same for all natural language grammars G (Bresnan and

Kaplan 1982: xlvii). It is significant that Bresnan and Kaplan's grounds for stating

    that CFG does not adhere to this constraint come from syntax, not phonology:

Bresnan, Kaplan, Peters, and Zaenen 1982 have shown that there is no context-free phrase-structure grammar that can correctly characterize the parse trees of Dutch. The problem lies in Dutch cross-serial constructions, in which the verbs are discontinuous from the verb phrases that contain their arguments… The results of


Bresnan, Kaplan, Peters, and Zaenen 1982 show that context-free grammars cannot provide a universal means of representing these phenomena. (p. xlix)

    Of the other constraints, one is the creativity constraint. One of the claimed

    contributions of generative grammar to linguistics was the observation that if a

    grammar is to be an equally valid model both of linguistic perception and production,

    it should be able not only to assign structure to strings, but also to generate strings

    (hence the term generative grammar). This observation is, for example, one of the

    foundational tenets of Chomsky (1957). As noted by Matthews (1974: 219),

    generative linguistics was partly a reaction to structuralist linguistics, which (it was

    claimed) emphasized assignment of structure at the expense of generation. Despite the

    emphasis of generative linguists upon the generative, it is notable that context-

sensitive grammars and those more powerful are not necessarily reversible (Bear

    1990). However, CFGs do always have the property of reversibility: that is, they can

    be used either for generation or recognition.
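
By way of a concrete illustration of this reversibility (the example is mine and is deliberately trivial; it is not part of the grammar proposed in this dissertation), the following Prolog DCG for the language a^n b^n can be run in either direction:

    % The same context-free rules both recognize and generate strings.
    ab --> [].
    ab --> [a], ab, [b].

    % Recognition:  ?- phrase(ab, [a,a,b,b]).         % succeeds
    % Generation:   ?- length(S, 4), phrase(ab, S).   % S = [a,a,b,b]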

Another constraint which CFGs satisfy is Bresnan and Kaplan's reliability

    constraint: that is, they can always accept or reject strings in a finite amount of time.

    One of the properties of context-free (and more restricted) languages is that of

    decidability (alternatively known as computability, Turing-decidability or

recursiveness). A language L is decidable if there is an algorithm for determining

    membership in L; in other words, L is decidable if there is a grammar which can

    decide whether a string is well- or ill-formed (a member of L or not) in a finite

amount of time. Languages of type m, where m ≤ 1, are not necessarily decidable, but

    those of type n, where n > 1, are always decidable. Bresnan and Kaplan argue that

    natural languages must be decidable, since:


It is plausible to suppose that the ideal speaker can decide grammaticality by evaluating whether a candidate string is assigned (well-formed) grammatical relations or not. The syntactic mapping can thus be thought of as reliably computing whether or not any string is a well-formed sentence of a natural language. This motivates the reliability constraint: that the syntactic mapping must provide an effectively computable characteristic function for each natural language. (p. xl)

    The principal objection which has been raised to this assumption, and one

    which is noted by Bresnan and Kaplan, is that native speakers often do not do well at

parsing garden path constructions such as 'The canoe floated down the river sank' and

'The editor the authors the newspaper hired liked laughed'. However, they suggest,

    plausibly, that these constructions do not disprove their hypothesis. After all, they

    argue, speaker-hearers can disambiguate these sentences and recover from the garden

    paths, given more (but not infinite) time, and possibly a pencil and paper.

    A third reason for choosing the formalism of CFG is that the ordering of the

    rules of CFGs will not affect the way in which they function or their end result

(although the ordering of application of rules may have an effect on the outcome). All

forms and constraints in CFGs are partial descriptions of surface representations, there is no

rule which does not ultimately constrain surface forms, all constraints must be compatible and

    apply equally, and any ordering of constraints will describe the same surface form

    (Scobbie, Coleman and Bird 1996). The motivation for this Order-free Composition

    Constraint, as Bresnan and Kaplan (1982: xlv) call it, is the fact that complete

    representations of local grammatical relations are effortlessly, fluently, and reliably

    constructed for arbitrary segments of sentences (Bresnan and Kaplan 1982: xlv).

    Again, this does not hold for all types of grammar.
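
A small illustration (again mine, not drawn from the grammar developed later): the two DCG definitions below state the same rules in different orders, and they accept exactly the same strings; in a pure CFG, clause order affects at most the order in which solutions are found.

    % Rule order does not change the language defined.
    np_a --> [cat].
    np_a --> [dog].

    np_b --> [dog].
    np_b --> [cat].

    % ?- phrase(np_a, [dog]).   % succeeds
    % ?- phrase(np_b, [dog]).   % succeeds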

    There are thus a number of reasons why it is desirable to restrict a grammar so

    that it is no more powerful than context-free. To summarize, these are as follows:


- CFGs are a relatively restricted class of grammar, and we would like to choose the most restricted theory which will account for the facts;
- CFGs have a generative as well as a recognitive interpretation;
- CFGs are Turing-decidable;
- the rules of CFGs need not be ordered in any particular way;
- although CFGs have been shown to be unable to cope with all aspects of syntax, there is no evidence to suggest that they are insufficient as far as phonology is concerned.

    1.4 The methodology

    Generative linguists often claim that linguistics is a science. This claim is

    made for phonology, for example, in Halle (1959: 24). What is meant by this?

    Sommerstein (1977: 9) answers this question with respect to phonology as follows:

In science we frame and test hypotheses. It does not matter in the least how these hypotheses are arrived at in the first place; it is the exception rather than the rule for an interesting hypothesis to be reached by a mechanical procedure, such as phonemic analysis essentially is. Rather, what makes a hypothesis scientific or unscientific is whether it can be stated what kind of empirical evidence will tend to disconfirm it, and what kind will definitely refute it. And there is no reason why this general scientific principle should not be valid for phonological analysis.

    Thus any grammar we propose has the status of a scientific theory that

attempts to account for observed linguistic data. On a philosophical level, the data exist independent of any grammar; in other words, the existence of sentences, words,

    etc., in a language does not depend on our ability to formulate grammar rules to

    account for them. The only way of determining how well a grammar really does fit

    the data is to test it empirically. One way in which scientific methodology can work is

    incrementally: we look at the cases where a theory does not fit the data and modify


    the theory accordingly. One would hope that the coverage of each successive theory

    advanced using this kind of methodology would eventually approach 100%.

    I shall now elucidate what is meant here by the coverage of a linguistic

    theory. As we saw in 1.3.1, a given grammar may be descriptively but not

    explanatorily adequate, but the converse is not possible. It may also be neither

    descriptively nor explanatorily adequate, which means that it fails altogether to assign

    a structural description to some utterances. For an imperfect grammar of this type, the

    set of correctly parsed utterances will be a subset of the set of parsed utterances,

    which in turn will be a subset of the set of all utterances, as Figure 2 illustrates.

Figure 2. Classification of analyses of an imperfect grammar

    P: All words
    Q: Words assigned some structural description
    R: Words assigned the correct structural description

There are three measures that we shall be interested in. The first of these is coverage,

the number of utterances in Q as a percentage of the number of words in P. The

second is correctness or structural coherence, the number of utterances in R as a

percentage of the number in P. The third is the number of utterances in R as a

percentage of the number in Q. Arguably, the second of these is the best overall


    measure, but as we do not always have access to data which tell us what the correct

    structures are, in some cases we have to use the first instead. The third measure will

    be most relevant in Chapter 5, where we need to separate the issues of morphological

    structure and stress assignment in order to be able to do a like-for-like comparison

between the phrase-structure phonology proposed in this dissertation and Melvold's

    theory.
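
Stated symbolically (this is my own restatement, writing |X| for the number of utterances in set X of Figure 2; the third measure is not given a name in the text, and I gloss it here simply as the correct proportion of parsed forms):

    \[
    \mathrm{coverage} = \frac{|Q|}{|P|}, \qquad
    \mathrm{structural\ coherence} = \frac{|R|}{|P|}, \qquad
    \frac{|R|}{|Q|} = \mathrm{correct\ proportion\ of\ parsed\ forms},
    \]
    where R ⊆ Q ⊆ P.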

    The methodology that underlies the current work is also incremental. In

    subsequent chapters I advance theories about the syllable structure and morphological

    structure of Russian words which are arrived at by trial and error: see, for example,

    (91) in 3.2. The process of actually checking the descriptive adequacy of a grammar is

    straightforward and well-suited to computational processing, since the latter is fast

    and reliable.
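
A minimal sketch of what such a check looks like in Prolog is given below (this is an illustration of the idea only, not the program actually used; word//0 is a dummy stand-in for whatever DCG non-terminal defines well-formed words):

    % Dummy grammar, for illustration only.
    word --> [c], [v].

    % coverage(+Words, -Coverage): the percentage of Words that are
    % assigned some structural description (set Q as a proportion of P).
    coverage(Words, Coverage) :-
        findall(W, ( member(W, Words), phrase(word, W) ), Parsed),
        length(Words, Total),
        length(Parsed, N),
        Coverage is 100 * N / Total.

    % ?- coverage([[c,v], [v,c], [c,v]], C).
    %    C = 66.66666666666667.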

    1.5 The dataset used for the tests

    In order to test a grammar computationally, it is necessary to have some kind

    of lexical database which one can use as the dataset for the tests. As a minimum, the

    database used in the research described here has to give the following information for

    every word therein:

- A phonological transcription
- The position of the word-stress
- The position of all morpheme boundaries within the word
- The part of speech

    Additional information which would have been desirable for each word in the

    corpus, but was unobtainable on a systematic basis, was as follows:


- A phonetic transcription
- The position of all syllable boundaries within the word

    Although there are many existing electronic corpora for different languages

    (including Russian), the requirements of the research described in this dissertation

    were such that no existing electronic corpus was adequate for the purpose. Thus part

    of the preliminary work necessary was to compile a purpose-made lexical database. In

    this section, I discuss how I did this.

    Oliverius (1976) contains a list of 2,493 morphologically tokenized words.

    However, these words are all headwords. There are two major reasons why it is

desirable to extend Oliverius's list to include inflected forms. First, if the dataset is

    restricted to the words in Oliverius (1976), a large portion of the vocabulary of

    Russian (all the inflected forms) is missed. This is unacceptable because the current

    dissertation explicitly deals with the application of phonological theories of Russian

    to the output of both derivation and inflection. Secondly, the larger the dataset used as

    the basis for testing theories, the greater the level of significance the results will have.

    One way of computationally extending the list to include inflected forms

    would be to compute the inflected forms (with stress) from the head-words and

    information about their stress patterns. This information can all be found in Zaliznjak

    (1977), which is available in an electronic version. A program could be written to go

    through the list of words in Oliverius (1976), matching each to the relevant entry in

    Zaliznjak (1977), and generating the appropriate inflected forms. Although it could be

    automated, even this approach would be a large undertaking, primarily because of the

thoroughness of Zaliznjak's description: the key in Zaliznjak which explains the

    meanings of the tags to each entry takes up a significant amount of space in the


    dictionary (132 pages). This information is not included in the electronic version, and

    it would all have somehow to be input manually if the inflected forms of all entries in

    the dictionary were to be generated computationally.

    Fortunately, however, this was unnecessary. One of the products of the

    research carried out by Brown and his colleagues at the University of Surrey (Brown,

    Corbett, Fraser, Hippisley and Timberlake 1996) is a theorem dump listing the

    inflected forms of 1,536 nouns. This file includes comprehensive information about

word-stress, but the words are only partly morphologically tokenized (since stem-inflection junctions, but not stem-internal morpheme junctions, are given).

    In order to ensure that all possible forms from the University of Surrey

    theorem dump were fully morphologically tokenized, each headword from the

    theorem dump was matched to headwords from Oliverius (1976) and the

    morphological tokenization of inflected forms was extrapolated from the

    morphological tokenization of the headword, by the procedure outlined in (1):

    (1) (a) For each headword (e.g. #8!&%!$;0%2! fool) in Oliverius (1976), find

    whether it is a noun by searching through the on-line version of

    Zaliznjak (1977), which provides part-of-speech information.

(b) For each (headword) noun identified by (a), search for all related
inflected forms in the theorem dump. For #8!&%!$;0%2! 'fool', these
would be as follows:

                               #8!&%) (nom./acc. pl.)
        #8!&%& (gen. sg.)      #8!&%,( (gen. pl.)
        #8!&%8 (dat. sg.)      #8!&%&+ (dat. pl.)
        #8!&%,+ (instr. sg.)   #8!&%&+) (instr. pl.)
        #8!&%" (loc. sg.)      #8!&%&: (loc. pl.)


    (c) Pair the headword with its morphological tokenization, which is known

    from the information in Oliverius (1976) (for example, !$;0%2! would

    be paired with the tokenization !$;0ra+%2sn+in1!4), and deduce the

    noun-stem by removing the inflectional ending (in this case, zero). The

noun-stem of !$;0%2! would thus be !$;0ra+%2sn!.

    (d) Morphologically parse the inflected forms using the parsing

    information about the stem from (c), and parsing whatever is to the

    right of the stem as the inflectional ending. In this example, the

    inflected forms would be parsed !$;0ra+%2sn+%in!, !$;0ra+%2sn+;in!,

    !$;0ra+%2sn+6+in!, etc. More detailed information on how inflectional

    endings are categorized and distinguished is given in section 3.2.1.2.
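
Step (d) can be sketched in Prolog roughly as follows. The sketch is illustrative only, not the program actually used, and the transliterated atoms dur, ak, durak and duraka merely stand in for the transcriptions given above:

    % parse_inflected(+StemTokens, +Stem, +Form, -Tokens): whatever
    % follows the known stem in an inflected form is parsed as the
    % inflectional ending (Ending = '' in the case of a zero ending).
    parse_inflected(StemTokens, Stem, Form, Tokens) :-
        atom_concat(Stem, Ending, Form),
        append(StemTokens, [Ending], Tokens).

    % ?- parse_inflected([dur, ak], durak, duraka, T).
    %    T = [dur, ak, a].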

    The procedure in (1) was automated, except in the case of nouns which exhibit

    a vowel-zero alternation within the stem (such as ,%',!62/46! window [nom. sg.],

    ,%,'!/6264! windows [gen. pl.]). The morphological tokenization for these forms

    was input manually.

    As it turned out, 967 of the 2,493 words in Oliverius (1976) were nouns; 835

    of these were included in the theorem dump. Some of these nouns are identified by

    the theorem dump as having incomplete paradigms, so the number of inflected forms

including head-words identified by step (b) of (1) was 9,633 (slightly less than 12 × 835 = 10,020).

    4 The notation is explained fully in section 3.2.1.2.


    The morphologically parsed inflected forms were combined with the rest of

    the morphologically parsed head-words in Oliverius (1976), giving a sample of fully

    morphologically parsed words as in Table 2.

Table 2. Analysis of words in on-line corpus

    Category    Head-words or inflected forms   In Oliverius (1976)   In theorem dump   Number of words
    Non-nouns   Head-words                      yes                   -                 1,525
    Nouns       Head-words                      yes                   -                 132
    Nouns       Head-words                      yes                   yes               835
    Nouns       Inflected forms                 -                     yes               8,798
    Total                                                                               11,290

    Regrettably, the on-line corpus of 11,290 word-forms does not include any

    inflected forms for non-nouns, which means that the results presented in this

    dissertation will have greatest weight in their applicability to nouns. But it would not

    be fair to say that this dissertation is limited in its scope to nouns, because, as can be

    seen from Table 2, the number of non-nouns is great enough that statistically

    significant results can still be achieved. When more comprehensive electronic corpora

    of Russian become available, it will certainly be interesting to see whether re-running

    some of my tests on these corpora gives results in line with those I report here;

    presumably, the null hypothesis would be that this will be the case.

    1.6 Summary

    In this chapter, I have established the approach which I employ in developing

    a computational phonology of Russian, and dealt with various issues relating to my

    perspective. To summarize, the aim in subsequent chapters is to formulate a broad-


    coverage phonology, which is generative, context-free, coherent, and makes

    predictions that can be shown empirically to be correct, or at least a good first

    approximation at correctness. To the extent that this aim succeeds, this work will fill

    an important gap in the literature to date, as no other work of which I am aware meets

    all these criteria simultaneously.


    Chapter 2: Syllable structure

    2.1 Overview and aims

    This chapter presents a sonority-based syllable structure grammar of Russian.

    As well as aiming to advance a specific proposal about Russian, I also aim in this

    chapter to contribute to the general debate on syllabification in two ways.

    First, because the grammar is implemented as a Prolog DCG and tested for its

    coverage of a corpus of Russian words, I am able to identify a list of exceptions to the

    Sonority Sequencing Generalization (SSG), which is widely accepted in one form or

    another as the standard means of accounting for phonotactic constraints. The list of

    exceptions is comprehensive with respect to the dataset tested, so the test allows us to

    quantify precisely how problematic Russian is for the SSG.
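
To make the connection between the SSG and a DCG implementation concrete, a minimal sketch is given below. It is an illustration only, not the grammar developed in this chapter: the segments and sonority values are assumptions chosen purely for the example, and the grammar actually proposed is considerably more articulated.

    % A syllable is admitted just in case onset sonority rises towards
    % the peak and coda sonority falls away from it (the SSG).
    son(a, 5). son(o, 5). son(u, 5).   % vowels
    son(r, 4). son(l, 4).              % liquids
    son(m, 3). son(n, 3).              % nasals
    son(s, 2). son(z, 2). son(v, 2).   % fricatives
    son(t, 1). son(d, 1). son(k, 1).   % stops

    cons(S) --> [C], { son(C, S), S < 5 }.

    syllable --> onset, nucleus, coda.
    nucleus  --> [V], { son(V, 5) }.

    onset --> [].
    onset --> cons(S), onset_tail(S).
    onset_tail(_)  --> [].
    onset_tail(S0) --> cons(S), { S > S0 }, onset_tail(S).

    coda --> [].
    coda --> cons(S), coda_tail(S).
    coda_tail(_)  --> [].
    coda_tail(S0) --> cons(S), { S < S0 }, coda_tail(S).

    % ?- phrase(syllable, [t,r,o,n]).   % succeeds: sonority rises t < r, falls to n
    % ?- phrase(syllable, [r,t,o]).     % fails: onset sonority does not rise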

    Secondly, we shall see further evidence that it is worthwhile to include a

    formal definition of the term syllable in a phonology, as Fudge (1969) suggests: it is

    not enough to refer to the syllable without explicitly defining it, as in Chomsky and

    Halle (1968). The syllabification grammar outlined here is put to work in a variety of

    areas of Russian phonology: it takes on a role as a structure in which to apply

    phonotactic constraints, a role familiar from Kahn (1976); it is also the structure for

    the implementation of other phonological constraints, such as assimilation, word-final

    devoicing, consonant-vowel interdependencies and vowel reduction; and, as will

    become apparent in Chapter 5, it takes on a novel role in stress assignment (novel,

    because no other treatment of Russian stress hinges on syllable structure in the way

    which I suggest).


    To my knowledge, there are no comprehensive treatments of Russian syllable

    structure comparable to the one proposed in this chapter. Bondarko (1969) is a

    proposal, based on experimental measurements of relative durations of consonants

    and vowels in the speech chain, that all consonants (and consonant clusters) in

    Russian (except for cluster-initial !7!) syllabify together with the following vowel,

    meaning that almost all Russian syllables are open. If this is true, this would amount

    to a comprehensive proposal on Russian syllable structure, but the problem with

Bondarko's proposal is that it says nothing about the kinds of clusters that cannot

    occur syllable-initially. In other words, the evidence that Bondarko examines excludes

evidence about the phonotactic constraints of Russian: for example, Bondarko's

    theory does not explain why no Russian words begin with !42!. This kind of

    consideration is the starting-point of this dissertation; after all, a generative grammar

    must be able not only to assign syllable structure, but also to generate legal structures

    and rule out illegal ones. Thus the grammar I propose, contrary to Bondarko (1969),

    suggests that a number of different types of closed syllable can occur in Russian.

    The remainder of this chapter is organized as follows. Section 2.2 reviews the

    literature on syllable theory. Sections 2.3-2.4 describe a phrase-structure sonority-

    based theory about Russian syllable structure. This theory is a linear (i.e. Type 3)

    grammar, with all the advantages this brings (see section 1.3.2). However, the nature

    of the constraints needed to account for Russian syllable structure is far from obvious.

    The primary aim of the discussion in sections 2.3-2.4 is to establish what these

    constraints are, rather than debating the issue of how syllable structure is assigned. I

    then move on in section 2.5 to select four key aspects of Russian phonology which

    have attracted attention in the literature: the constraints on the consonant clusters


    permissible in Russian, assimilation in voicing and palatalization, word-final

    devoicing and reduction of unstressed vowels. For each of these problem areas, I set

    out what appear to be the facts as generally accepted: the aim of this section is to

show that these facts need not be seen as divorced from syllabification, but that an account

    of them can be integrated into the existing PSG of Russian syllable structure. Indeed,

    in some cases, there is a clear advantage in this kind of integration. For example, the

properties of !"! with respect to voicing assimilation are most simply explained by

    taking into account the features which the syllable structure grammar assigns to !"!.

    The result is that a single grammar fulfils a variety of functions, assigning syllable

    structure, mapping phonemic representations to phonetic representations, and, as we

    shall see in Chapter 5, acting as an indispensable component in a theory about stress

    assignment in Russian.

    2.2 The syllable in phonological theory

    The syllable is by no means a recent construct. It was discussed as a unit of

    linguistic organization in, for example, Whitney (1865), Sievers (1881), Jespersen

    (1904), de Saussure (1916), Grammont (1933), Bloomfield (1933) and Hockett

(1955). Bloomfield, for example, states that "the ups and downs of syllabication play an important part in the phonetic structure of all languages" (p. 121; Bloomfield's emphasis). It was in the 1950s and 1960s that the status of the syllable was both

    implicitly and explicitly questioned in generative phonology: implicitly, by its notable


    absence in Halle (1959)5 and Chomsky and Halle (1968), and explicitly, in Kohler

    (1966: 346-348). As Fudge (1969: 261-262) points out:

Chomsky and Halle (1968) continually invoke syllables, monosyllables, disyllables, etc. in their less formal discussions (in the text frequently, but sometimes also within the systems of rules proposed), and even postulate a feature Syllabic which would characterize all segments constituting a syllable peak (354). Unfortunately, none of these terms are made explicit in the text or in the rules … The term syllable does not even figure in the index of Chomsky and Halle (1968).

In fact, we may state that it is not satisfactory to deal with the structure of one element in terms of statements designed to deal with the structure of an essentially different and only indirectly related element. If we want to state syllable-structure, we must explicitly introduce the element syllable into our linguistic description, and state its relations to the other elements of the phonological hierarchy; it is precisely this which Chomsky and Halle (1968) fail to do.

From that time, partly as a reaction to Chomsky and Halle's work,

    phonological theory has swung back towards endorsing the syllable. Indeed, even

before Halle (1959), Haugen (1956: 215-216) writes of the syllable that "one would be tempted to deny its existence, or at least its linguistic status, as some have done, were it not for its wide persistence as a feature of most linguistic descriptions … those who attempt to avoid the syllable in their distributional statements are generally left with unmanageable or awkward masses of material". This shortcoming of Chomsky and Halle's theory is pointed out not only by Fudge (1969), who argues that the element

    syllable should be made explicit, but also by Hooper (1972) and Vennemann (1972);

    the latter uses evidence from languages other than English to advocate the

    incorporation of syllable boundaries and syllables in phonological descriptions (p. 2).

    Perhaps the best-known work pointing out the inadequacies of Chomsky and Halle

    (1968), though, is Kahn (1976): Kahn states that in describing productive

    5 For further discussion of the absence of the syllable in Halle (1959), see section 2.2.2.


    phonological processes he was hampered by the absence of a generative theory of

    syllabification (p. 17). Kahn observed, in particular, that the phonotactic constraints

    of English could be accounted for indirectly but simply by considering syllable

    structure (pp. 40-41, 57-58). Clements and Keyser (1983), endorsing Kahns

    hierarchical analysis of the syllable, argued however that syllabicity was not a

    property of segments per se as Kahn suggested (Kahn 1976: 39), but rather involves

    the relationship between a segment and its neighbors on either side (Clements and

    Keyser 1983: 5): to account for this, they proposed analyzing syllables in terms of

    three tiers, the syllable tier and segmental tier (as in Kahn 1976) and an additional CV

tier. Selkirk (1984) follows Clements and Keyser in rejecting [±syllabic] as a feature

    of segments.

Despite the criticisms of certain aspects of Kahn's approach, it has generally

    been acknowledged since Kahn (1976) that the syllable is an indispensable unit of

    linguistic organization. For example, a contemporary description of Slavic prosody,

    Bethin (1998), makes the following statement:

We find that many prosodic features are restricted to or expressed on syllables, that certain restrictions on permissible consonant and vowel sequences are best described as holding within a syllable, that there are phonological and morphological processes which seem to be conditioned by the syllable, and that many of these processes count syllables but do not, as a rule, count phonemes or segments. (p. 192)

    It seems, therefore, that the syllable is here to stay in linguistic theory, and in

    particular that an account of syllable structure is an essential part of a generative

    phonology of Russian. One aim of this chapter, therefore, is to put forward a specific

    grammar of Russian syllable structure as part of the overall phonology proposed in

    this dissertation. This grammar is explicit about what Russian syllables are; it does


    state its relations to the other elements of the phonological grammar, as Fudge puts

    it, and because the theory is implemented computationally and tested for its coverage,

    a standard is set against which future proposals can be measured.

    2.2.1 Sonority and syllable structure

    The notion of the syllable is inextricably linked to that of sonority, which has

    for more than a century been believed by phonologists to be an important factor in the

    structure of syllables (Whitney 1865, Sievers 1881: 159-160, Jespersen 1904: 186-

    187, de Saussure 1916: 71ff. and Grammont 1933: 98-104). Essentially, the idea is

    that segments can be categorized with respect to sonority: those that are more

    sonorous tend to stand closer to the centre of the syllable, and those that are less

    sonorous closer to the margin. Clements (1990: 284) notes that this principle

    expresses a strong cross-linguistic tendency, and represents one of the highest-order

    explanatory principles of modern phonological theory. However, there are a number

    of questions about sonority which to date have not been answered. Essentially, these

    have to do with (a) how sonority is defined, and (b) at what linguistic level sonority

    holds (Clements 1990: 287, Bethin 1998: 19-21).

    As far as the first of these is concerned, there have been various attempts at

    defining sonority. Bloomfield (1933: 120-121) equated sonority with the loudness of

    segments (the extent to which some sounds strike the ear more forcibly than others);

    another proposal is that sonority can be derived from basic binary categories, identical

    to the major class features of standard phonological theory (Selkirk 1984, Clements

    1990); and some have suggested that sonority does not have any absolute or

    consistent phonetic properties (e.g. Hooper 1976: 198, 205-206). Even ignoring the

    question of how sonority is defined phonetically, there is disagreement on what the


    sonority hierarchy is; and until this issue is resolved, as Selkirk points out,

    discovering the phonetic correlates of sonority will be difficult. For example,

    Clements (1990: 292-296) proposes a hierarchy where obstruents are less sonorous

    than nasals, nasals less sonorous than liquids, and liquids less sonorous than glides.

    Glides, in turn, are seen as non-syllabic vowels. On the other hand, Selkirk (1984:

    112) sets out a more detailed hierarchy, as follows:

(2) Sounds (in order of decreasing sonority)

    a
    e, o
    i, u
    r
    l
    m, n
    s
    v, z, ð
    f, θ
    b, d, g
    p, t, k

    Whatever the exact classification of sounds by sonority, it seems to be a

    general rule that for each peak in sonority in a string of phonemes, there will be a

    syllable (Bloomfield 1933, Selkirk 1984, Clements 1990). Perhaps the best-known

formulation of the sonority principle is Selkirk's (1984: 116) Sonority Sequencing

    Generalization (SSG):

In any syllable, there is a segment constituting a sonority peak that is preceded and/or followed by a sequence of segments with progressively decreasing sonority values.


    In this formulation, the syllabicity of segments depends on their position,

    rather than on any inherent phonological property of their own (Clements and Keyser

    1983: 4-5, Selkirk 1984: 108, Blevins 1995, Bethin 1998): sonority peaks simply

    align with syllable peaks. This offers an explanation in terms of syllable structure for

    the fact that glides and approximants can function as either consonants or vowels, a

    fact that was noted as early as Siev