crystal 01


Upload: renealain6495

Post on 17-Jan-2016


Page 1: Crystal 01

1 Some preliminary considerations

1.1 The need for information

While there has been a rapid development in the study of English speech during this century, particularly stimulated by the advent of an autonomous and scientifically orientated discipline of Linguistics, it is none the less very plain from looking at the theoretical and practical articles and handbooks dealing with speech that there are certain areas of study which have received either superficial treatment by scholars or no treatment at all. Despite much excellent work that has been done on such topics as intonation and stress in English (reviewed below in chapter 2), there are still aspects of both (the semantics of intonation, for example) which, while well appreciated as being of fundamental interest, have received little investigation. There is also a marked lack of information about the total range of vocal phenomena that are linguistically relevant for the study of a language. There has been little attempt to delimit, and systematically describe and classify, this whole range of vocal effect within one theoretical framework, and the bibliography is equally sparse on the correlation of intonation and other features with more well-recognised kinds of linguistic organisation, particularly syntax. To study the reasons for this inadequacy is instructive, and I shall discuss some below. Meanwhile, it suffices to say that most of what I shall be calling the ‘prosodic features’ of English have not been described in earlier phonological or semantic studies of speech; many have been unnoticed as displaying any degree of systematicness comparable to that normally noted in other parts of language; and a few have been deliberately excluded from the legitimate field of linguistic study (cf. 4.16). And while a clear boundary-line between linguistic and non-linguistic in this field is often difficult to draw (cf. 3.11 and 4.16), a case can and should be made for the inclusion of more under such a heading as ‘the linguistic contrasts available in English speech’ than has hitherto been allowed.

The descriptive inadequacy is of course reflected and intensified in the teaching situation, where it is only recently that the linguistic importance of even such features as intonation has been more than superficially realised. It is still rare to find a general guide to English speech (as opposed to a specific study of intonation) which pays close and systematic attention to even the majority of variables within the ‘non-verbal’ or ‘suprasegmental’ aspects of spoken language, or a grammar which has a section on intonation and stress near the beginning, or anywhere. Again, the absence of any well-defined theory and procedures of analysis has resulted in distortions and vague conceptual terminology in many of the textbooks which purport to be introductions to intonation and related features in English. This is indeed a paradoxical situation for English language pedagogy to find itself in, as it is a phonetic ‘residue’ of imperfectly learnt prosodic features which is usually the final barrier to the mastery of a foreign language, by maintaining a stubborn accent, on the one hand, and by obscuring the full range of attitudinal contrasts which prosodic contrasts indicate, on the other. Before such a situation can ever hope to be improved, however, it is necessary to evaluate the already available information about these features, and to supplement it by a large-scale, systematic survey of all the variables made use of in English speech. The conclusions presented in this book are based on an analysis of a large sample of English collected for this purpose.

It is understandable that the study of intonation and related features should be in such a state, when one considers the difficulties involved in subjecting this aspect of language to analysis: problems of obtaining reliable information, of defining the range of variables affecting any semantic interpretation, and of identifying and measuring such elusive phenomena as pitch (see 1.3, 2.10.1, 3.7 and 7.2). But by far the most important reason for current inadequacies is a historical one: as will become clear from chapters 2 and 5, the demands of English-language teaching in the early decades of this century produced partial descriptions which, in the absence of sufficient theoretical and descriptive research, regularly involved oversimplification and misinterpretation. Traditions of study were established which have only recently begun to be critically examined, and the extent to which certain aspects of the subject have been neglected is only now being realised. Misleading, impressionistic statements about specific intonation patterns are the most obvious and widespread result of this neglect, a similar state of affairs to that found in English grammar over the past three or four decades (though the enlightenment which has percolated through grammatical description does not seem to have had any effect on intonational study). Such statements usually imply the existence of statistical support (through the use of such terms as ‘normally’), but in fact would seem to be based largely or totally on an unscientific impressionism that all too often results in oversimplification of a complex linguistic situation, or in partial truth, which, as is well known, the student-teacher too readily generalises. The observation made by Coleman as long ago as 1914 could be made with equal validity today:

investigations hitherto have been deprived of much of their value through the inquirers’ not going far enough afield (thus building up a theory on the few obvious examples that occur to one at the moment) or through their taking their examples from the connected language of books where the sentences are often altogether wanting in the variety found in conversational speech (p. 19).

A typical example of the kind of general statement referred to is to be found in most descriptions of the intonation of questions in English. A major distinction is made between the intonation of ‘particular’ questions (those beginning with an interrogative word such as ‘how’) and ‘general’ questions (those arising from inversion of the subject and finite verb): the former are said to be pronounced ‘normally’ with a falling tone, the latter with a rising tone. Statements of this kind, without any further descriptive qualification (as to what is meant by ‘normally’, for example) seem to go back at least to Butler (1633), and may be found in almost every exposition of English intonation that has been written.¹ Analysis of most varieties of English speech, however, shows that the issue is hardly as simple as this, it being quite possible to have both a falling and rising tone with each kind of question, the difference lying in the type of attitude involved.

What are you doing? [intonation diagram: falling tone]

is generally more serious and abrupt in its implications (at least for British English) than the more friendly and interested

What are you doing? [intonation diagram: rising tone]

¹ See, for example, Sweet (1890, p. 32), Palmer (1922, p. 73; 1924, §§439, 444), Armstrong & Ward (1926, pp. 10, 21), Palmer & Blandford (1927, p. 3), Jespersen (1933b, §28.6), Bloomfield (1933, pp. 171 ff.), Coustenoble & Armstrong (1934, p. 14), Harris (1944), Allen (1954, p. 43), O’Connor (1954, p. 91), Jones (1956a, p. 228), Kingdon (1958a, p. 210), and Stockwell (1960a). Cf. also Fries (1964, p. 245) for a partial review of this issue.


and similarly with

Are we going out this evening? [intonation diagram: falling tone]

compared with

Are we going out this evening? [intonation diagram: rising tone]

all things (for example stress, speed of utterance, facial expression) being equal. It is realistic to generalise only if contextual information is provided and supported by statistics. As Fries says, in the preliminaries to reporting an experiment on this subject (1964, p. 245):

General impressions without the support of any actual counting of the instances in even a small corpus provide no satisfactory ground for generalising concerning comparative frequency of occurrence. Unfortunately, so far as the evidence goes, the many assertions concerning the rising intonation as the usual mark of yes-no questions in English have not been based on any adequate body of quantitative information.

Fries’s own analysis is stylistically too restricted to provide results of any generalisability (all the questions he examined were used in the context of a quiz-programme), but his findings (that yes-no questions used a falling tone approximately 62 per cent of the time) are not without significance, and an examination of the distribution of tones on such questions in any kind of conversational English shows a similar flexibility. Clearly a more complex descriptive statement is suggested, to allow amongst other things for attitudinal and stylistic variables. Occasionally, one does find more satisfactory attempts at description and prescription (Huang & Green (1964), for example), but the enlightenment which Coleman showed (1914, 20) in his critique of Jespersen’s position in this matter is unfortunately absent from the most widely used textbooks.
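The ‘actual counting of the instances’ that Fries asks for can be sketched in a few lines. What follows is a minimal illustration only, not the procedure used by Fries or in this book: the tokens are invented, and the names `TOKENS` and `tone_distribution` are hypothetical. It tallies the tones found on each kind of question and allows the tally to be restricted to one context, so that a statement such as ‘normally falling’ can be checked against counts.

```python
from collections import Counter

# Invented tokens for illustration only: (question type, nuclear tone, context).
# They are NOT drawn from Fries's data or from the corpus described in this book.
TOKENS = [
    ("yes-no", "fall", "quiz"),
    ("yes-no", "rise", "conversation"),
    ("yes-no", "fall", "conversation"),
    ("yes-no", "rise", "conversation"),
    ("wh", "fall", "conversation"),
    ("wh", "rise", "conversation"),
    ("wh", "fall", "quiz"),
]


def tone_distribution(tokens, question_type, context=None):
    """Return {tone: proportion} for one question type, optionally within one context."""
    counts = Counter(
        tone
        for qtype, tone, ctx in tokens
        if qtype == question_type and (context is None or ctx == context)
    )
    total = sum(counts.values())
    # An empty selection yields an empty distribution rather than a division by zero.
    return {tone: n / total for tone, n in counts.items()} if total else {}


if __name__ == "__main__":
    print(tone_distribution(TOKENS, "yes-no"))
    print(tone_distribution(TOKENS, "wh", context="conversation"))
```

Keeping the context as a separate field is a deliberate choice in this sketch: it makes the stylistic restriction of a sample (a quiz-programme, say) visible in the counts instead of hiding it inside a single overall percentage.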

The intonation of questions is but one field which has been badly treated; there are many others. It is the presence of such misleading information, plus the absence of any synchronic description of the full range of non-segmental vocal effect in English, which provide the two main reasons supporting the need for fresh research into a familiar area. In this book, I hope to be able to carry out two tasks: first, to outline and justify a theoretical framework which will define and interrelate all the non-segmental contrasts which exist in English, and which will allow for variation in depth of descriptive interest as well as for differences in the kind of phenomena involved; second, to select certain aspects of this field for more detailed analysis and discussion, on the grounds that they have been particularly neglected, or that they are aspects about which there is a need for fresh thinking. I shall not have solutions for all the problems which emerge, but it should at least be possible to highlight the difficulties and suggest more precise ways of talking about them.

1.2 The scope of ‘prosodic’

A definition of ‘prosodic systems’ will only really emerge during the course of this book, as it will arise out of the study of the features that would be subsumed under the term (see especially chapter 4). The respectable ancestry of the word ‘prosodic’ will be given in chapter 2. Meanwhile, it will be helpful to give an indication of what is going to be involved in studying this field, so as to provide some perspective for the discussion of procedures which follows.

There could be both negative and positive ways of approaching a definition. From the negative point of view, one might say that within the act of speech, there are aspects of language structure which would be outside the scope of a formal prosodic analysis: grammar, vocabulary, and segmental phonology.¹ If one could imagine these aspects removed from speech, the systems of linguistic contrasts in the non-segmental ‘residue of utterance’ would be the subject-matter of prosodic analysis, in my sense.² More precisely and positively, we may define prosodic

systems as sets of mutually defining phonological features which have an essentially variable relationship to the words selected, as opposed to those features (for example, the (segmental) phonemes, the lexical meaning) which have a direct and identifying relationship to such words. For this book the primary prosodic parameters, along which systems of linguistically contrastive features can be plotted, are the psychological attributes of sound described below as pitch, loudness and duration, which have a primary (but not an identifying) relationship with the

¹ I am taking the inventory of phonemes and syllables in English as ‘given’, as is usual in research in this area. In fact, the phonemic analysis presupposed is that of Gimson (1962); issues of syllable division, etc., are referable to O’Connor & Trim (1953) who, it is worth noting, make a complementary deliberate omission: ‘We have taken no account in our work of prosodic features’ (p. 105), though ‘prosodic’ here is not being used in precisely my sense.

² For the term ‘residue of utterance’, see Hultzén (1964, p. QS); cf. also Hockett’s ‘macrosegment’, which has two immediate constituents, an intonation and a remainder (1955, p. 44).


physical dimensions of fundamental frequency, amplitude, and time respectively (see chapter 3). ‘Intonation’, for example, is viewed as the product of a conflation of different prosodic systems of pitch contrasts; ‘stress’ is referable to variations in the loudness parameter. Other prosodic systems comprise independently varying vocal effects based on combinations of these three parameters in specific ways (for example, rhythmicality), or on contrasts in silence (the system of silent pause). Other vocal effects, similar in their general relation to the segmental side of language and in semantic role, but distinct in their physiological [text illegible in source], I shall be calling paralinguistic features. The distinction between prosodic and paralinguistic is discussed in detail in 4.3, and other, similar uses of these terms are referred to in chapter 2.

I considered and rejected a number of alternative terminologies to the use of ‘prosodic’: ‘suprasegmental’, for example, was unsatisfactory, as it carried too dominantly the implications of a specific linguistic theory and method which is inadequate (see 5.2), and also because the prefix ‘supra-’ implies a priority of segmental over non-segmental linguistic features which is linguistically suspect (cf. the argument of 4.16). The term ‘tone of voice’ was also considered as an alternative, but while this had the virtue of familiarity, it had the corresponding vice of vagueness, on account of its popular usage and a multiplicity of senses which covered linguistic as well as non-linguistic components of utterance.

1.3 Analytic procedures

It is important to state the main principles and procedures underlying my analysis, not only for internal clarification, but because certain of the issues involved are somewhat controversial, in particular what is meant by a linguistic ‘analysis’, the reasons for using a corpus, and which descriptive techniques are most reliable and useful. Concentration on the ‘how’ or the ‘whence’ of description must not of course be allowed to sidetrack the linguist for too long from his primary descriptive purpose: one could easily spend a whole book restricting oneself to a discussion and evaluation of discovery procedures (see the valuable survey in Samarin (1967), for example). Some knowledge of this kind, however, is a useful preliminary to evaluating any analysis: I will therefore briefly outline the points of major methodological interest, before going on to the principles and results of the description itself.

First, it is important to point out that the term ‘analysis’ has been used in a number of different senses in early work in this field, some of which are confusing if not vacuous. This can be seen if one examines such common uses as the following, the term being employed differently in each case: ‘auditory analysis’, ‘acoustic analysis’, ‘analysis of the data’ (i.e. to find out what the linguistic contrasts are, presumably), ‘analysis’ in a statistical sense (cf. p. 10), and ‘an analysis’ (in the sense of ‘structural description’). The latter is the sense in which I am using the term: by ‘analysis’ I am referring to the explication of the non-segmental contrasts perceived in my data as meaningful (in the sense of p. 19 below) by postulating a set of prosodic systems within which they may be defined and interrelated. I am not referring to the process of recognition which produces our awareness of such contrasts in the first place (so that it is not strictly meaningful to talk of ‘analysing’ the data, without much qualification); nor am I referring to the different techniques which are available in order to reduce one’s data to a form more amenable for analysis (so that I shall not talk of auditory/acoustic/articulatory analysis, but techniques).

Secondly, the speech data which provided verification for the analysis were selected to cover a range of educated English, described below, and gathered together as a corpus. The need for such an independent body of data was clear. In a field which deals so closely with personal attitudes, often imprecisely definable by introspection, and difficult to measure and assess, it was not possible or desirable to rely on my own impressions of usage to determine either the form or the function of prosodic features, nor was it possible to determine frequential information in this way. As has been frequently recognised, this is particularly the case when the investigator is a trained phonetician: ‘those who have been trained to a conscious control of their intonation patterns cannot provide the body of spontaneous utterances from which to discover exactly what the patterns are and the relative frequency of their use’ (Fries, 1964, p. 245). Apart from this, analysis based on the speech-acts of a single person, such as myself, would be unsatisfactory for two reasons: first, I am not certain about my usage of prosodic features in all respects; second, I would find it impossible to say where the boundary line should be drawn between the culturally determined, conventional, linguistically significant features of speech, and the physiologically determined, individual and linguistically uninteresting features (see 3.2, where this distinction is amplified), and my descriptive statements would consequently be ‘skewed’ by being focused too closely on the idiosyncratic. Reference to information


derived from native speakers other than myself, then, was essential as a check on and a supplement to my own intuition (in no sense is the corpus a replacement of this intuition, as is sometimes naively suggested), and as a source of suggestions about the internal organisation of language which might have been ignored or overlooked if I had relied solely on introspection.

The actual constitution of the corpus is outlined below (p. 12). Being by definition finite, and incomplete in its attempt to be a representative sample of spoken English, it occasionally needed to be supplemented by information on specific questions. In such cases, the policy used here was to refer to the intuitions of other native speakers (informants), who were as linguistically naive as possible, by obtaining their reaction about linguistic events in well-controlled experimental situations. Such tests, relating as they did to areas in which the corpus and my own intuition were felt to be inadequate, were not planned in advance as a series of tests, but were rather introduced as the need for them became obvious. When used in the present research, such ad hoc testing is referred to as it arises (see 5.3, 7.3). Clearly, if carefully carried out, quite a large amount of useful information about a specific problem can be accumulated fairly rapidly.

The aim of the research, then, was to study the performance (acts of speaking, or utterances) of a number of speakers, gathered together into a corpus, or elicited in informant-reaction tests, as evidence for the definition of the conventional, non-idiosyncratic linguistic system which underlies these utterances. In doing this it is important to stress the consideration, already suggested, that an approach which uses a corpus is ‘corpus-based’ in only a weak sense of that term: it is not synonymous with ‘corpus-restricted’. To go beyond the corpus is envisaged from the very outset, as ultimately one wants to make statements about the language system as a whole, or, to put it in current generative terms, about the underlying ‘competence’ which linguistic performance is supposed to reflect: ‘The problem for the linguist ... is to determine from the data of performance the underlying system of rules that has been mastered by the speaker-hearer and that he puts to use in actual performance’ (Chomsky, 1965, p. 4; cf. also Chomsky & Halle, 1965, p. 103). To restrict one’s attention wholly to the utterances of one’s corpus, even though they may show interesting and unexpected patterns, is valuable for such people as the psychiatrist or stylistician, but it is over-limiting for the general linguist, who rather uses this material as a


relatively more objective, public and accessible starting-off point than is otherwise available.

The concentration on performance in many parts of this book is therefore a procedural requirement, and (from the point of view of linguistic theory) a limited aim; but it is none the less a prerequisite in this field, in view of the indeterminacy and intuitive uncertainty about the facts of intonation which the linguist finds. This is an important point, which was only hinted at above. It is not the case that a native speaker has conscious tacit knowledge of all the patterns and regularities that constitute English, certainly not in the field of prosodic features (cf. Chomsky, 1965, p. 8). Reliance on intuition is justified until one meets with conflicting intuitions about what is the case, and phenomena like intonation produce queries on every front (though no-one thereby doubts the underlying systemicness of intonation). This in my view is the main justification for any statistical work on language: it should produce information about the facts of English which is unexpected, non-trivial. Of course only such facts will be accepted as linguistically significant as receive a posteriori intuitive ratification, otherwise one would let in distributions reflecting all kinds of irrelevant phenomena (cf. the problems of acoustic techniques, p. 13). But the function of intuition here is not as a basis of analysis or classification. The statistics display a distribution or pattern which, upon reflection, we realise underlies our unconscious performance. (They may also, of course, confirm our previously held tacit knowledge of patterns, in which case their use would be more trivial; but, as mentioned, our certain tacit knowledge of prosodic features in English is very slight.) In other words, statistics are not brought in at random; they are only introduced at points where the data display no obvious pattern, but reason and intuition tell us that there must be some systemicness present, otherwise the conventional communicative value of prosodic features would break down.

The function of the statistics which lie behind many of the statements in the linguistic description in this book is therefore limited, but essential: it is a supporting and clarificatory role, and not usually an explanatory or evaluative one; it normally verifies (or fails to verify) hypotheses reached on intuitive grounds, but it may sometimes act as a stimulus for the formulation of new hypotheses about language. But the potential relevance of statistics has rarely been given explicit recognition in linguistic research (though cf. Classe, 1939, Herdan, 1966), being but obliquely referred to through the use of inspecific adverbials of frequency,


as mentioned above, or through the introduction of an (undefined) norm of some kind: see, for example, Smits Van Waesberghe (1957, p. 374), Classe (1939, pp. 3647), or Chreist (1964, p. 45).¹ One must not of course be distracted by the quantitative side of the exercise: non-quantitative information on procedures is required at three separate points, in the preliminary linguistic analysis, in the initial sampling, and of course in the interpretation of the statistical results. The basic point to be remembered by the linguist embarking on research in this field is that statistical analysis must itself be dependent on a certain minimum of prior established qualitative analysis of some kind: obviously one has to have something to count at the beginning, to have some structural units defined, and moreover to be fairly sure that what is to be counted is in fact worth counting (cf. Reed, 1949, pp. 235 ff.); and such issues are decided independently of the actual choice and utilisation of statistical techniques.

Problems of sampling are also a separate issue, and are bound up with the nature of the statistical model one is using. The main principle of any such model is that language is viewed as a class of events grouped into categories, each event having a probability of occurrence. This probability is readily definable (cf. Newman, 1957, Connolly & Sluckin, 1962) as the ratio of the number of times an event in a category does occur to the number of times it might have occurred, i.e. to the total number of possible events. It is axiomatic in this approach that ‘The most likely value of a proportion in a total set, that is, of the true probability, is the value observed in a limited sample’ (Newman, 1957, p. 119). But this is only applicable if it is the case that the initial sampling is adequate, and is as random a sample as possible of the total linguistic population which one is trying to define (see below).
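The ratio just defined is simple to compute, and a short sketch shows why the adequacy of the sample matters. The function names below are hypothetical and the counts in the usage lines are invented. The standard-error function is an addition not made in the text: it applies the usual binomial approximation, which itself assumes random, independent sampling, exactly the condition the paragraph says cannot be taken for granted.

```python
import math


def observed_probability(occurrences, opportunities):
    """Ratio of the times an event did occur to the times it might have occurred."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    if not 0 <= occurrences <= opportunities:
        raise ValueError("occurrences must lie between 0 and opportunities")
    return occurrences / opportunities


def standard_error(p, n):
    """Binomial standard error of a proportion p observed over n opportunities.

    Assumption added for illustration: the n events are a random,
    independent sample of the population being described.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # Invented counts: an event observed 12 times in 40 places where it could occur.
    p = observed_probability(12, 40)
    print(p, standard_error(p, 40))
```

The same observed proportion carries far less weight when it rests on 40 opportunities than on 4,000, which is one way of making concrete the requirement that the initial sampling be adequate.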

Because of the existence of widely divergent linguistic varieties of spoken English (formal religious monologue alongside informal sports commentary, for instance), it is clearly impossible to achieve any such degree of generality as ‘a sample representative of the language as a whole’. Instead of truly random sampling (probably impossible in language study: cf. Newman’s scepticism (1957, p. 122)), the linguist must begin by making a controlled sample of utterances, choosing texts for analysis within a relatively restricted and clearly defined range of

¹ This concern for norms is reflected in the comparable optimism of other branches of linguistics, such as stylistics; see Gregory & Spencer (1964, p. 102), Quirk (1961a, pp. 216 ff.).


usage, which is intuitively homogeneous as far as linguistic features other than those he is studying are concerned. Thus, before beginning to study prosodic features, one would choose a corpus of utterances as homogeneous as possible from the point of view of regional dialect, register, formality of discourse, and so on in respect of segmental phonetic characteristics, grammar, and vocabulary (for a discussion of the relevant stylistic categories involved, see Crystal & Davy, 1969). The investigator is here concerned simply with minimising the most influential variables on the form of the utterance. He need not be too anxious to obtain absolute consistency in this matter. Apart from the fact that this will be impossible in any case in view of the ever-present idiosyncratic or person-identifying elements in utterance, there is also the point that statistically valid reasoning may be made without such initial total consistency. As Reed says (1949, p. 246): ‘To pursue a practical working method, the quantitative linguistic investigator must collect his samples of evidence from a variety of sources within the area of the language that he is investigating. In this way, he trusts that those factors which may tend to prejudice the results of his study will cancel one another out’ (my italics). Having said this, however, it must be pointed out that within the stylistic and regional limitations of my corpus as it stands (see below), the selection of the different speakers was carried out on a random basis; that is, it was judged that there was as much likelihood of utterances of speaker A in text X occurring in the situation¹ of text Y, and so on.
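The two-step procedure described here (first hold the non-prosodic variables constant, then choose at random within what remains) can be sketched as follows. This is an illustration under stated assumptions, not a record of how the corpus was in fact assembled: the records, the field names and the function `controlled_sample` are all hypothetical.

```python
import random

# Hypothetical text records; field names and values are invented for illustration.
TEXTS = [
    {"id": "A", "dialect": "educated British English", "formality": "informal"},
    {"id": "B", "dialect": "educated British English", "formality": "informal"},
    {"id": "C", "dialect": "educated British English", "formality": "formal"},
    {"id": "D", "dialect": "other", "formality": "informal"},
    {"id": "E", "dialect": "educated British English", "formality": "informal"},
]


def controlled_sample(texts, fixed, k, seed=None):
    """Keep only texts matching every fixed variable, then draw k of them at random."""
    pool = [
        text
        for text in texts
        if all(text.get(key) == value for key, value in fixed.items())
    ]
    if k > len(pool):
        raise ValueError("not enough texts in the homogeneous pool")
    # A seeded generator makes the random draw repeatable.
    return random.Random(seed).sample(pool, k)


if __name__ == "__main__":
    fixed = {"dialect": "educated British English", "formality": "informal"}
    for text in controlled_sample(TEXTS, fixed, k=2, seed=1):
        print(text["id"])
```

Only the second step is random. The first is a matter of the investigator’s judgment about which variables to hold constant, which is why the sample is controlled and not truly random.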

In order to attain a reasonable degree of linguistic homogeneity, then, the speakers whose utterances were analysed were all chosen from a single dialect which I have called, following other uses of the label, educated British English (cf. Jassem, 1949, Dietrich, 1956, Quirk, 1957, 1960, Nyqvist, 1962); the range of accents used would generally be classed together as kinds of Received Pronunciation (see Abercrombie, 1956, Gimson, 1962). This choice was directed, not of course by any concept of a linguistic betterness within this dialect and accent, but by the greater usefulness of research based upon them, as opposed to any other British regional or class dialect or accent: this variety is the one

¹ For the notion of ‘situation’, see Crystal & Davy (1969, chap. 1). It should also be clear that I am only concerned in this book with the direct representation of prosodic features in the spoken medium: I am not concerned with the reflection of these contrasts in conventional orthography (such as the way in which authors attempt to reproduce the effect of different ‘tones of voice’ in their work) or in any other medium; these are related but logically later subjects of study.


upon which most research has already been done, and the basis of the majority of textbooks available, hence its use will facilitate the correlation of my results with already familiar information; also, it is the variety which is used by most of the socially influential section of society, and hence a ‘prestige’ variety for the majority of educated native speakers and foreign students. Finally, it is the most suitable variety to set up as a realistic norm for British English because it is the most neutral from the point of view of indicating region: it is geographically unmarked (cf. Abercrombie, 1956, p. 44).

The term ‘educated’ is introduced to indicate more clearly the boundaries of the corpus, to label an intuitively recognisable internal coherence. It is only superficially circular to say that educated English refers to the kind of English used by educated people: if necessary, one could define what standards enter into a social expectation of educatability, and what linguistic values are involved. In the present case, all speakers had university degrees. The linguistic function of the label ‘educated’ is simply to draw a distinction between the kind of English being analysed and the extremes of sub-standard and regional dialectisms which would not be acceptable to the majority of educated society in serious situations.

The primary data for the analysis consisted of approximately 30,000 words (about 3 hours) of various informal discussions and conversations. There were 30 speakers in all, the majority being male and middle-aged. None of the participants in any conversational situation were meeting each other for the first time. In addition, material was collected representing other major spoken varieties of English (including radio talks and news, sports commentary, television advertising, sermons, speeches, lecturing, story-telling), in order to see the extent to which linguistically significant but situationally restricted non-segmental contrasts were in operation. Analysis of this further material is still continuing: a process of verification of the system’s adequacy by applying it to new data must go on until all patterns seem to have been defined and classified. Situationally restricted prosodic or paralinguistic contrasts will be referred to as they arise; otherwise, all references to ‘the data’ should be construed as meaning the primary corpus of informal discussion.

A period of three hours of connected speech is an adequate sample for present purposes: it is sufficiently long to be representative of the whole range of prosodic features outlined in chapter 4 (though not all the paralinguistic ones), but not overlong. The deciding matter in this question of length is practicality, in terms of the linguist’s time. Transcribing the speech accurately to account for all the linguistic contrasts was a task which took well over a year - and this excludes the checking of the transcription by two other linguists. According to Pittenger (1963, p. 142), it took Hockett 25-30 hours to do the first five minutes of psychiatric interviews for The First Five Minutes - very rapid going. Clearly, the justifiable demands for a larger corpus can only be answered by pointing to the practical difficulties involved.
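The scale of the practical difficulty can be made concrete with a little arithmetic. The following Python sketch is purely illustrative: it assumes, hypothetically, that the rate Pittenger reports for Hockett (25-30 hours per five minutes of speech) were applied uniformly to a three-hour corpus. The function name and the uniform-rate assumption are not in the original text, and the rate belongs to a much narrower kind of transcription than the one adopted here.

```python
# Illustrative extrapolation only: assumes a constant effort per five minutes
# of speech, which is a simplifying assumption, not a claim from the text.
FIVE_MINUTE_UNITS_PER_HOUR = 60 // 5  # twelve five-minute stretches per hour


def transcription_hours(corpus_hours, hours_per_five_minutes):
    """Working hours needed if each five minutes of speech costs the given effort."""
    return corpus_hours * FIVE_MINUTE_UNITS_PER_HOUR * hours_per_five_minutes


low = transcription_hours(3, 25)   # lower bound of the reported rate
high = transcription_hours(3, 30)  # upper bound of the reported rate
print(f"{low}-{high} hours")       # 900-1080 hours
```

On these assumptions a three-hour corpus would cost somewhere near a thousand working hours at that level of detail, which is one way of seeing why a broader transcription and a corpus of limited length were the practical choice.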

A further point which has to be raised in any discussion of procedures is the question of which descriptive technique in general phonetics is most appropriate for the study of non-segmental vocal effect. By descriptive techniques I mean the methods one may use in order to recognise, delimit and notate the linguistic contrasts in the speech-data. Of three such techniques available in general phonetics, two are clearly potentially usable, the acoustic technique and the auditory (cf. Cowan, 1962, pp. 567 f.). The third, the articulatory technique, is not sufficiently refined to be useful in describing the minute matters arising from variation in pitch, loudness and tempo, and is only used in relation to the study of paralinguistic features (see below, chapter 4). The important question which had to be decided at the outset of this research was whether the present study would benefit most from either acoustic or auditory techniques, or whether a combination of both was required.

The realisation that there are two sides to the understanding of the fact of speech - the physical and the psychological - and two main techniques of recognition which complement rather than alternate with each other, has led many linguists and phoneticians to criticise any approach which would be based on only one or the other; and the popularity - to some extent due to the novelty - of purely instrumental techniques of investigation has begun to diminish. But even with simpler, speedier, cheaper, more sensitive and reliable instruments to analyse a corpus of speech than exist at present, the results would still be of limited value in trying to reach any understanding of the meaning of such vocal effects as intonation and other prosodic features when perceived by the listener, for the obvious reason that the instrumental analyses produce an excessive amount of detail, too much for any pattern to be seen. The physical correlates of those features, or accumulations of features, which are of linguistic significance for the native speaker, are obscured by the presence of a large amount of accompanying but less relevant (or irrelevant) phenomena. Daneš (1957, pp. 40-1; cf. 1960, p. 37) summarises the dangers of the instrumental approach and then suggests the alternative:

Granting that instruments are more accurate and sensitive than the human ear, it is also evident that instrumental records and their interpretations in terms of physical acoustics do not give us a true picture of the way in which the speakers hear and understand (evaluate) their own language. . . The significance and function of the various waves, formants, etc. - these must first be discovered, if only in outline, by an auditory analysis [sic] of spoken language.

This conclusion is dominant in the literature on this subject, as the following references should make clear: Pike (1945, pp. 14 ff.), Jassem (1952b, pp. 17-18), Schubiger (1958, p. 2), Hadding-Koch (1961), etc.

A second, practical difficulty involved in making an analysis of connected speech which ultimately refers to the acoustic level follows on from this: namely, that the degree of scrutiny required to identify and classify the acoustic correlates of the linguistically relevant features would take up so much time as to limit the amount of utterance undergoing study very severely indeed. It is indeed valuable to take the first five minutes of conversation, and to analyse them with close reference to phonetic correlates (a major operation even when acoustics is not involved, cf. Pittenger, Hockett & Danehy, 1960), but this does not get the linguist as far as he would like to go. He wants to find out what happens in the next five minutes, and the five after that - particularly when he is examining linguistic contrasts which are discontinuous and relatively infrequent. His corpus of material must extend to many thousands of words if he wishes to make descriptive statements about English prosodic features which have any validity. Thus he has to find a middle way, a compromise between an over-narrow acoustic phonetic method of identification and transcription, and over-general statements which take in a great quantity of material but which still leave the basic problems of classification untouched.

The acoustic technique on its own, therefore, while providing the linguist with readily quantifiable data, and a valid point of relatively objective reference, none the less restricts him too much. He does better by beginning with a strictly auditory technique, referring any queries (such as whether a pitch glide is rising or falling) to an acoustic record, whenever this might help in providing some explanation for the difficulty, and trying to establish the relationship between acoustic/articulatory and auditory patterns, as far as this is possible. The term ‘auditory’ is not however particularly clear, and a comment on its sense in this book is necessary. It can be interpreted in one of two ways. The most general sense is in the context of the psychology of perception (and also in general phonetics, whenever this is truly ‘general’, i.e. not phonologically orientated experimentation), where it refers to the psychological ability to perceive sound (or some attribute of sound, see chapter 3) through aural mediation only - one talks of ‘auditory sensation’, for example. I shall use ‘auditory’ in this sense only in the discussion of sound attributes (pp. 101 ff.). Distinct from this, we have the use of ‘auditory’ to imply interpretation rather than sensation, and this is the sense used throughout the rest of the book: the ‘auditory technique’ refers to an interpretative method of establishing the range of linguistic contrasts (and of course what is non-linguistic) in data prior to analysis, which is based on an intuitive assessment of what can be heard there (as opposed to what can be seen and heard, as in acoustics, or seen, heard and felt, as in the articulatory technique).

Two points should be noted in this connection. First, the distinction between auditory sensation and interpretation is a result of there being two independently orientated models of description in this field (that used in the psychology of perception and that used in linguistics), and does not imply that one must recognise two separate perceptual stages in a linguistic theory of performance. Secondly, it is no longer felt that auditory interpretations of speech must be fallible, ‘mere’ impressionism. Technological advances in recording techniques have allowed the communication situation to be studied more thoroughly and easily. Material can now be obtained in large amounts and subjected to highly detailed examination via the tape recorder and camera, and both linguistic and kinesic analysis have benefited. The occurrence of phenomena can be observed at will by linguists working independently to begin with, then, at a later stage of investigation, in collaboration; so that by dint of repetition the contrasts in the material will be identified as consistently and accurately as possible, and descriptive statements of high reliability made after discussion ‘over the table’. In those cases where differences of opinion are too great to be resolved, at least the point at issue can be clearly stated, the different points of view tabulated and prepared for scrutiny by others. This may itself be important for delimiting areas of indeterminacy in language: query decisions very often form a pattern.*

* This was the method used in describing the corpus for this book. The transcription was checked by two other linguists, points of dispute (over which prosodic contrast was involved) being allowed to stand, and noted as such.


McQuown (1957) concludes that such a high level of agreement would in fact be reached by any six different analysts working in any branch of linguistics. He suggests three principles on which to work: total accountability (‘everything on the tape must be categorised analytically and adequately rendered by the symbology’), replicability (‘investigators can listen to the same tape, and, within the limits of human error, apply the same analytic categories (and their corresponding symbology) and come out with the same transcription, save for minor differences made inevitable by the elasticity of the analytic systems’), and verifiability (‘Where there are differences in the transcription. . .investigators can refer to a particular symbol representing a particular sound-configuration on the tape and can. . .iron out their differences’). This may be an optimistic view of the perfectibility of the phonetician. There always seem to remain, even in the best descriptions, genuine points of unresolved query as to which of a number of linguistic contrasts is being used, that it would be unscientific to obscure. Anyone who has worked on the transcription of intonation knows that, in a dispute as to what is the case on the tape, one linguist (X) might well be able to convince another (Y) that his (X’s) interpretation was correct, but it is doubtful whether Y’s conviction has any great descriptive cogency: usually a fresh listening on a later occasion reproduces the original division of opinion. It would seem to be this interpretative conflict which is the important linguistic issue, and which should be recorded in the transcription - particularly in view of the fact that agreement on the majority of points of interpretation is relatively easily obtainable. McQuown’s procedure, with this rider in mind, seems a very reasonable and practical one. It is just this possibility of retracing the steps and regrouping the linguistic counters which supplies the auditory technique with an objective footing.
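The idea of recording interpretative conflict, rather than obscuring it, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used for the corpus: the function name, the one-label-per-unit representation, and the tone labels (‘fall’, ‘rise’, and so on) are all hypothetical.

```python
def compare_transcriptions(first, second):
    """Compare two analysts' label sequences for the same stretch of tape.

    Returns the proportion of units on which the analysts agree, together
    with a list of (position, first_label, second_label) for every dispute,
    so that disputed points are kept on record instead of being resolved away.
    """
    if len(first) != len(second):
        raise ValueError("both transcriptions must cover the same units")
    disputes = [
        (index, a, b)
        for index, (a, b) in enumerate(zip(first, second))
        if a != b
    ]
    agreed = len(first) - len(disputes)
    agreement = agreed / len(first) if first else 1.0
    return agreement, disputes


# Hypothetical labels for five successive units, as heard by two analysts.
analyst_x = ["fall", "rise", "fall", "fall-rise", "level"]
analyst_y = ["fall", "rise", "rise", "fall-rise", "level"]

rate, disputes = compare_transcriptions(analyst_x, analyst_y)
print(rate)      # 0.8
print(disputes)  # [(2, 'fall', 'rise')]
```

Keeping the list of disputes alongside the agreement rate reflects the rider above: agreement on most points is easily obtained, and the residue of unresolved queries is itself data worth tabulating.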

Using controlled auditory techniques, with occasional reference to acoustic correlates, then, the first requirement of this research was to reduce the corpus of speech data to a form at once more permanent and available for analysis, quantification, and the making of comparative judgements. There were of course a number of notations available, for example using numerical scales (Coleman, 1914); numerical levels (Pike, 1945, Trager & Smith, 1951); indications of fundamental frequency (Jassem, 1952b); linear staves with dots (Jones, 1956b; MacCarthy, 1956), dashes (Coustenoble & Armstrong, 1934), or both (Armstrong & Ward, 1926, Kingdon, 1958a); lines within the text (Fries, 1952, Bolinger, 1958a); musical notation (Fónagy & Magdics, 1963); indications of tonetic stress (Schubiger, 1958, Lee, 1960); head plus nucleus (Palmer, 1924); and tonetic stress plus numbers (Halliday, 1963). None of these systems, as they stood, was adequate for the present task, either because the principles underlying the system were suspect (as in the case of transcriptions based on musical staves or numerical levels), or the notation involved too much detail (as in the use of fundamental frequency), or it was not detailed enough, presenting no suggestion as to how the full range of prosodic variables in a language should be notated.

In choosing a system of transcription to reflect the prosodic features recognised in the analysis, certain principles had to be borne in mind. Clearly, any such study requires a system of notation that is accurate and consistent, but to be useful it must also allow large quantities of material to be described with a high degree of facility - in other words, it must be as automatically applicable as possible,* using a minimum of symbols, and grading the complexity of the symbols to reflect the different degrees of significance attached to different features of the data. The more important and frequent a prosodic feature, the simpler should be the notation used to refer to it (cf. Crystal & Quirk, 1964, p. 57). From a practical viewpoint, the exponents of the analysis (and hence the transcription) should not be given too narrow a phonetic definition, for two reasons: on the one hand, it is not normally possible (largely for reasons of available time, but also because the tape-recording process used tends to obscure phonetic minutiae) to discriminate narrow phonetic details of large amounts of connected speech on tape and provide a correspondingly adequate notation; and on the other hand there is the theoretical objection that a highly detailed notational method lays itself open to the same charges as were levelled against the acoustic technique above - namely, that there would be too much activity reproduced in the results, of various degrees of significance and non-significance, for the linguist to be able to perceive clearly what is linguistically relevant and what is not. (There is also the point that the more detailed the notation, the more difficult it is to get a number of linguists to agree as to what they hear as exponents of prosodic features in any particular utterance.) The transcription, therefore, had to be fairly broad, covering only those aspects of vocal effect which were determinable as linguistically significant (see below), and had to be as immediate a pictorial stimulus as possible for ease of reading and repeating. The transcription as it ultimately developed tries to take account of all these principles: details are given in chapters 4 and 5, with explanatory reference to the more familiar interlinear phonetic transcription.

* This requirement also involves facility in typographical reproduction: cf. the notation devised to suit a modified typewriter keyboard used in Crystal & Quirk (1964).

Finally, it is worth emphasising that I am attempting to discuss the non-segmental linguistic contrasts in English in formal terms; I am not primarily concerned with the referential nature of the meanings which such contrasts may be said to carry. This point needs to be made, in view of the semantic emphasis (in a referential sense) of most past work in this field, but I take it to be such a regular part of modern linguistic methodology as to require no argued defence. It is now agreed that an inadequate formal description will only produce distorted statements: it will either overlook significant contrasts altogether, or ascribe a faulty relationship between a particular form and a particular situation or group of situations (cf. above, p. 3). This is especially true for such relatively finite systems as intonation, where the meaning of items is largely a function of the total number of alternatives available and the total range of contrasts in the system in which the item operates. But an emphasis on formal exposition should not be interpreted as an attempt to exclude considerations of meaning completely. Most linguists would agree that statements of meaning are, in fact, their ultimate business; and in any case, no analyst can totally exclude the fact of meaning from his mind while carrying out his research, but must work with an awareness of meaning similarities or contrasts in the language he is examining. All that emphasising a formal, as opposed to a ‘semantic’ or ‘notional’ approach to description implies is that, procedurally, considerations of meaning (either in the sense of determining the way in which the formal contrasts correlate with lexical items which split up a particular semantic field, or in the sense of defining the use of patterns in relation to extra-linguistic situations) do not enter in until a stable basis of formally defined features has been determined. Then a more satisfactory classification of meanings can be carried out.

Moreover, considerations of meaning enter in as criteria for discriminating between various kinds of formal contrast, as a method of indicating where linguistic significance may be said to lie. For example, one may make use of a procedure similar to that often used in establishing segmental, phonemic contrasts: if a given phonetic feature’s status is uncertain, a decision as to whether it should be included as a functioning part of a prosodic system or not can be reached by asking whether this feature’s use can distinguish two otherwise identical utterances, so that linguistically untrained native speakers would consistently maintain the two utterances as being in some sense ‘different’ in meaning. Conversely, if the addition of a given phonetic feature was found to produce no such reaction, or a decision that the two utterances were the ‘same’, then the feature would be seen as insignificant. Although there are a number of difficulties involved in this approach (such as the problem of grading ‘degrees’ of sameness or difference, and the assumptions implicit in this method about the stability of our processes of perception), trying to ascertain whether utterances are equivalent or non-equivalent in this way is accepted as a valid use of the criterion of meaning in formal analysis, as it in no case involves either native speaker or analyst in stating wherein the sameness or difference lies in any absolute kind of way, by trying to assess the meanings of the two utterances, or labelling them* - and, as I shall suggest in chapter 7, it is in attempting to describe the meanings involved lexically that one finds the greatest difficulties.

* The experiment reported by Quirk & Crystal (1966) makes use of a similar technique (see below, p. 203). Cf. also Harris (1951, pp. 186-95), who lays stress on the value of using meaning as a criterion in this way, and Jassem (1952b, p. 29).
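The ‘same’ or ‘different’ procedure described above can be expressed as a small decision rule. The Python sketch below is only an illustration: the function name is hypothetical, and the threshold of 0.9 is an arbitrary stand-in for what it means for informants to respond ‘consistently’, a figure the text does not specify.

```python
def feature_is_significant(judgements, threshold=0.9):
    """Decide whether a phonetic feature counts as linguistically significant.

    `judgements` holds one response per informant, either "same" or
    "different", given after hearing two utterances that are identical
    except for the feature under test. The feature is treated as
    significant when the share of "different" responses reaches the
    threshold (an assumed cut-off, chosen here purely for illustration).
    """
    if not judgements:
        raise ValueError("at least one judgement is required")
    different = sum(1 for response in judgements if response == "different")
    return different / len(judgements) >= threshold


# Hypothetical panels of twenty untrained native speakers.
clear_contrast = ["different"] * 19 + ["same"]
no_contrast = ["same"] * 17 + ["different"] * 3

print(feature_is_significant(clear_contrast))  # True
print(feature_is_significant(no_contrast))     # False
```

Note that the rule asks informants only whether two utterances differ, never wherein the difference lies, which is the property that makes this use of meaning acceptable in formal analysis.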


Published by the Syndics of the Cambridge University Press
The Pitt Building, Trumpington Street, Cambridge CB2 1RP
Bentley House, 200 Euston Road, London NW1 2DB
32 East 57th Street, New York, NY 10022, USA
296 Beaconsfield Parade, Middle Park, Melbourne 3206, Australia

© Cambridge University Press 1969

Library of Congress Catalogue Card Number 69-13792

ISBN 0 521 07387 1 hard covers
ISBN 0 521 29058 9

First published 1969
Reprinted 1972, 1975
First paperback edition 1976

First printed in Great Britain at the University Printing House, Cambridge
Reprinted in Great Britain by Redwood Burn Limited, Trowbridge & Esher