3. Methodology

3.1 Introduction

As seen in Chapter Two, most earlier studies on the variable

marking of the accusative case have been conducted on a formal basis,

with little regard for the empirical validity of their hypotheses. In

contrast, this dissertation is concerned with the actual variation

of the accusative case marking and its relation to the grammatical system

of the language in the speech community. For this purpose, I adopt

variation theory as a research framework in this project. The framework,

originating in Labov’s study of several phonological variables in New

York City (Labov 1966), has been adopted as an analytical basis for

studies of language variation and change (Labov et al. 1968, Wolfram

1969, Anshen 1969, Cofer 1972, Cedergren 1973, Jain 1973, Trudgill 1974,

Modaressi 1978, Rickford 1979, Oliveira 1980, Gambhir 1981, Abdel-

Jawad 1981, Ash 1982, Poplack 1982, Lennig 1983, Labov 1984, Hibiya

1988, Hong 1991, Matsuda 1993a, inter alia).

Variation theory is a way of analyzing language variation, and its

methodological procedures are describable as a set of principles

concerning the data quality (the vernacular principle), the demarcation of

the linguistic environments from which the data is collected (the

envelope of variation), the way it is counted or coded (the accountability

principle), and the way it is analyzed (the multiplicity of factors of

language variation and change).

The vernacular principle states that the most systematic picture of

the language is obtained through the observation of speech where the

least attention is paid to it by the speaker (Labov 1972). In order to obtain

such speech, a valid fieldwork method must be established. Section 3.2

will describe the interview method used in my fieldwork in Tôkyô.

In natural speech, however, linguistic variation occurs only in

specific environments; thus, distinctions must be made between where

the variation is possible, where it is not and where the variants cannot be

reliably distinguished before collecting the data. Because this issue, the

envelope of variation, requires a detailed linguistic analysis, it will be

treated separately in the next chapter, “Delimiting the Envelope of

Variation for the Variable Marking of (o).”

The accountability principle requires reporting the variability “with the

proportion of cases in which the form did occur in the relevant

environment, compared to the total number of cases in which it might

have occurred” (Labov 1969: 738). How will those environments be

coded, and why are those environments picked for examination in the

first place? Section 3.3 will discuss the factors examined in this project

and the linguistic and sociological reasoning behind them. The actual

sample used for this dissertation is described in Section 3.4.
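
As a minimal sketch of the accountability principle, the following Python fragment computes the rate of a variant as the number of occurrences divided by the number of environments in which it might have occurred. The record layout and the field names (`marked`, `in_envelope`) are hypothetical illustrations, not the coding format used in this project, and the toy data are invented.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One potential site of the variable (hypothetical record layout)."""
    marked: bool       # True if the variant (e.g. overt o) occurred
    in_envelope: bool  # True if variation is possible in this environment


def application_rate(tokens):
    """Occurrences divided by all cases in which the form might have occurred."""
    eligible = [t for t in tokens if t.in_envelope]
    if not eligible:
        raise ValueError("no tokens inside the envelope of variation")
    return sum(t.marked for t in eligible) / len(eligible)


# Invented toy data: four eligible tokens (three marked) and one excluded token.
toy = [
    Token(True, True),
    Token(True, True),
    Token(False, True),
    Token(True, True),
    Token(True, False),  # outside the envelope, so it is not counted
]
rate = application_rate(toy)
print(f"{rate:.2f}")
```

Tokens outside the envelope are dropped before counting, so the denominator depends on how the envelope of variation is delimited.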

Lastly, most previous empirical studies on language variation

report the existence of multiple constraints on the linguistic variable (the

multiplicity of factors of language variation and change). In order to

evaluate the effectiveness of each constraint in an objective manner, a

valid statistical method must be used. Section 3.5 introduces the

computer program GoldVarb, the solution adopted in this project.

3.2 Fieldwork

My speech samples were collected mostly during the summer

fieldwork from May 1990 to August 1993, using a Sony WM-D6C tape

recorder with two lavaliere microphones (Sony ECM-T150). I recruited

speakers from two main sources: networks from previous interviewees

(including the ones from Hibiya’s network) and a community center for

aged people. The former technique has the advantage of being accepted

easily by the speaker, reducing the psychological barrier that usually

exists between the interviewer and the interviewee.

The community centers are found in a number of locations in

Bunkyô ward, which is situated between the Yamanote and Shitamachi

areas. Run by the municipal office, they are used as a sort of salon for

aged people, where they get together to eat lunch, practice their hobbies,

take baths, have parties or just chat with their friends over tea. Their general

atmosphere is very relaxed. Here, I was able to make contacts and

ultimately to conduct a number of interviews with older speakers.

Most of the interviews were done on a one-on-one basis, following

the interview schedule based on the Q-GEN-II protocol developed by

William Labov, to which I added a number of original questions that are

most likely to induce long casual responses from the speakers for

historical, sociological or cultural reasons. For example, for aged people

in general, topics concerning WW II, especially the air raid, or the Kantô

Earthquake in 1923 were quite effective in eliciting narratives and other

kinds of spontaneous speeches, while such topics as school or music

generally served a similar purpose for younger speakers.

In addition to the one-on-one interview, I also conducted a series

of group sessions to collect intra-peer interactions (Labov 1984). Two

typical cases are sessions with junior high school students and the

sessions recorded at community centers with aged people. In both cases,

the interviewer’s role was kept to a minimum, only throwing out some

new topics when the chat had died down, or doing minor chores for

them. For most of these speakers, one-on-one interview recordings were

also obtained, achieving a maximum style differentiation which is one of

the main purposes of sociolinguistic interviews (Labov 1966).

An extension of the group session is a recording where a tape

recorder was left in some part of a house and kept recording for

hours as long as interactions were going on. This was typically used for

the meal table and family gatherings.

3.3 Variables Considered¹

3.3.1 Internal Variables

The overview of past studies on the internal factors affecting the

variable marking of (o) in Tôkyô Japanese in Chapter Two revealed a

number of hypotheses that must be tested in the current study. Although

it is most desirable to enter all of those factors into a statistical procedure

whereby the most prominent ones are selected through empirical

testing of thousands of tokens from natural speech, such a procedure

would force one to keep coding some factors that are far off the mark for

thousands of tokens, a clearly inefficient strategy. The problem then boils

down to the balance between the exhaustiveness of the variables explored

and the cost of coding a number of factors.

In order to solve this problem, variable selection in this study is

done in two stages. First, with 2,000 tokens from 8 speakers (one from

each social group) the first statistical analysis was done. Those factors

that were deemed ineffective by bivariate and multivariate analysis were

ignored in further coding. In this phase, external factors were all ignored,

assuming the uniform effect of internal factors on speakers from all

segments of the speech community. Given the general independence

between the internal and external factors in variation phenomena

(Labov 1982), such an assumption seems to be well grounded.

1 In the variationist literature, it is customary to refer to what general statistics literature calls factors as factor groups, while referring to levels within a variable as factors. Although the usage is somewhat confusing, I will keep using these terms in the following exposition.

In the next phase, all and only the factors that were picked up in

the first phase were coded for the remainder of the corpus, and the final

variable selection was done together with the social variables. This

enabled me to locate the most prominent variables in the most efficient

way.

3.3.1.1 Linguistic Form of the Object NP

Given that the locus of marking is at the end of the object NP, it is

only natural that one of the prime suspects for influencing the rate of o-

marking is the object NP. As I showed in Section 2.9, HTU (1992: 1252)

found that marking is more frequent after common nouns than other

kinds of nouns (notably pronouns, after which the marking was rare).

One would then like to see finer differentiations of NP types: do all the

pronouns behave similarly? Or is it actually a subset of pronouns that

rarely allows marking? Among other NPs, what about clausal NPs? To

answer these questions, the object NP must be coded for its form —

wh/non-wh pronouns, NP and clausal NP.

(3.1) Coding Scheme for the Linguistic Form of the Object NP
• Pronoun:
    • Wh-pro
    • Non-wh-pronouns:
        • Anaphoric pronouns
        • Deictic pronouns
• NP
• Clausal NP

3.3.1.2 Final Particle

In view of Masunaga’s finding about the correlation of the

acceptability of o-marking and the presence of final particles (Masunaga

1988), it is necessary to pay attention to the particle as one of the

variables. As I described in Section 2.6 above, however, her deemphasis

analysis left it ambiguous whether the effect was due to its mechanical

presence (so that its meaning does not matter at all) or to its semantic

effect. Therefore, final particles will be coded for two features, form and

meaning:

(3.2) Coding Scheme for Final Particles: Form²
• No FP
• -ne, -nee
• -no
• -sa, -saa
• -ka, -kai, -noka, -kke, -kasira
• -yone
• -yo
• OTHERS

2 It was necessary (even within the formal coding) to group several similar forms into one category to keep the number of factors within one factor group reasonably small.

(3.3) Coding Scheme for Final Particles: Meaning
• Absent
• Emphatic (-sa, -mon, -yo, -yoo, -monnee, -mononano, -noyo, -sa, -saa, -nee, -ne, -yone, -ze, -none, -n)
• Confirmation/mitigation (-ne, -desyo(o), -zyan, -zyanai, -kedomo)
• Logical consequence (-wake, -hazu, -wakene, -wakedesyoo)
• Rhetorical question (-zyan)
• Female marker (-wa, -no)
• Explanation (-no, -ne, -wake, -mono)
• Question (-no, -ka, -nokana, -noka, -kana, -kanaa, -ne)
• Command (-na)
• Exclamation (-na, -nonina, -nee, -ne)
• Hearsay (-tte)
• Reason (-mon)

3.3.1.3 Information Status of the Object NP

Masunaga’s deemphasis analysis also claimed that the shared

information was relevant in the acceptability of o-marking. As I noted

there, this notion needs to be made more precise. Here, I adopt Prince’s

Assumed Familiarity scheme (Prince 1981). Discarding the omniscient

observer’s viewpoint underlying the previous attempts to capture the

different status of various NP’s in discourse (e.g. shared knowledge),

Prince proposed to put the matter at the level of ordinary humans

involved in verbal interaction, who, on the speaker/writer’s side,

produce certain linguistic forms with certain assumptions about the

hearer/reader, and on the hearer/reader’s side, draw certain inferences

based on those forms. The information status of any given NP in a

discourse can only be described reasonably in the form of an assumption,

hence her new terminology Assumed Familiarity.

The Assumed Familiarity taxonomy has three major nodes: New,

Inferrable and Evoked, with each having several subcategories (see the

diagram in Figure 3.1). When a new entity is introduced into the

discourse, either (i) the hearer must create a new entity or (ii) the hearer is

assumed to have that entity already in his/her discourse model. The first

case is Brand-New, and the second is Unused. A Brand-New entity can

also be Anchored, if the NP is linked to some other discourse entity.

Otherwise, it is Unanchored. This enables one to differentiate the three

New NPs — a bus, a guy I work with and Noam Chomsky — in the following

examples:

(3.4) Brand-New/Unanchored [Prince’s (22c)]
      I got on a bus yesterday and the driver was drunk.

(3.5) Brand-New/Anchored [Prince’s (22d)]
      A guy I work with says he knows your sister.

(3.6) Unused [Prince’s (22b)]
      Noam Chomsky went to Penn.

When a given NP has its discourse entity in the discourse-model, it

is evoked. Evoked entities can be of two types, textually evoked and

situationally evoked. The first type is where the hearer evoked the entity —

which was once New or Inferrable — on textual grounds, i.e. the NP was

already mentioned previously or it is inferrable from other previously

mentioned NPs. The second type is where the hearer knows that s/he can

evoke the entity him/herself for situational reasons (e.g. discourse

participants, salient contextual features, etc). Thus, he in (3.5) above is

textually evoked, while you in the following example is situationally

evoked:

(3.7) Situationally Evoked [Prince’s (22a)]
      Pardon, would you have change of a quarter?

In between New and Evoked stands the third category, Inferrable.

Prince (1981: 236) defines it as “[a] discourse entity is Inferrable if the

speaker assumes the hearer can infer it, via logical — or, more commonly,

plausible — reasoning, from discourse entities already Evoked or from

other Inferrables.” When this inference works by a set-member

relationship, it is called Containing Inferrable. In (3.4) above, the driver can

be inferred from a bus (since buses have drivers), hence its information

status becomes Inferrable, while in (3.8), one of these eggs is Containing

Inferrable.

(3.8) Containing Inferrable [Prince’s (22e)]
      Hey, one of these eggs is broken!

The relationship among these categories can be represented as in

Figure 3.1 below:

Assumed Familiarity
    New
        Brand-New
            Brand-New (Unanchored)
            Brand-New Anchored
        Unused
    Inferrable
        (Noncontaining) Inferrable
        Containing Inferrable
    Evoked
        Textually Evoked
        Situationally Evoked

Figure 3.1 Assumed Familiarity Hierarchy [Prince 1981: 237]
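
For coding purposes, the taxonomy can be held in a small lookup structure. The sketch below is one possible Python representation, assuming an enumeration of the seven terminal categories and a mapping to their three major nodes. The names `Familiarity`, `MAJOR_NODE`, `EXAMPLES` and `major_node` are hypothetical; the example NPs and their statuses are the ones given in (3.4)-(3.8) and the accompanying text.

```python
from enum import Enum


class Familiarity(Enum):
    """Terminal categories of the Assumed Familiarity taxonomy."""
    BRAND_NEW_UNANCHORED = "Brand-New (Unanchored)"
    BRAND_NEW_ANCHORED = "Brand-New Anchored"
    UNUSED = "Unused"
    NONCONTAINING_INFERRABLE = "(Noncontaining) Inferrable"
    CONTAINING_INFERRABLE = "Containing Inferrable"
    TEXTUALLY_EVOKED = "Textually Evoked"
    SITUATIONALLY_EVOKED = "Situationally Evoked"


# Each terminal category belongs to one of the three major nodes.
MAJOR_NODE = {
    Familiarity.BRAND_NEW_UNANCHORED: "New",
    Familiarity.BRAND_NEW_ANCHORED: "New",
    Familiarity.UNUSED: "New",
    Familiarity.NONCONTAINING_INFERRABLE: "Inferrable",
    Familiarity.CONTAINING_INFERRABLE: "Inferrable",
    Familiarity.TEXTUALLY_EVOKED: "Evoked",
    Familiarity.SITUATIONALLY_EVOKED: "Evoked",
}

# NPs from examples (3.4)-(3.8), with the status assigned to them in the text.
EXAMPLES = {
    "a bus": Familiarity.BRAND_NEW_UNANCHORED,
    "a guy I work with": Familiarity.BRAND_NEW_ANCHORED,
    "Noam Chomsky": Familiarity.UNUSED,
    "the driver": Familiarity.NONCONTAINING_INFERRABLE,
    "one of these eggs": Familiarity.CONTAINING_INFERRABLE,
    "he": Familiarity.TEXTUALLY_EVOKED,
    "you": Familiarity.SITUATIONALLY_EVOKED,
}


def major_node(np_text):
    """Return the major node (New, Inferrable or Evoked) of an example NP."""
    return MAJOR_NODE[EXAMPLES[np_text]]


for np_text, status in EXAMPLES.items():
    print(f"{np_text}: {status.value} ({major_node(np_text)})")
```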

3.3.1.4 Word Order

Subsumed under the heading of word order is (i) the adjacency

between the verb and the object NP, as proposed by Tsutsui (1984) and

Saito (1985), and (ii) the presence or absence of the subject of the sentence,

a point that was discussed in the previous chapter. These two subfactors

are combined into one variable here, making a four-way coding scheme

below:

(3.9) Coding Scheme for Word Order
• Object NP-Verb Adjacent³ and Subject Present in the Sentence
• Object NP-Verb Adjacent and Subject Absent in the Sentence
• Object NP-Verb Non-Adjacent and Subject Present in the Sentence
• Object NP-Verb Non-Adjacent and Subject Absent in the Sentence
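
The four-way scheme is simply the cross-product of two binary subfactors. A minimal Python sketch, assuming each token has already been judged for adjacency and subject presence (the function name `word_order_factor` and its boolean parameters are hypothetical):

```python
from itertools import product


def word_order_factor(adjacent, subject_present):
    """Combine the two binary subfactors into one label of scheme (3.9)."""
    adjacency = "Adjacent" if adjacent else "Non-Adjacent"
    subject = "Present" if subject_present else "Absent"
    return f"Object NP-Verb {adjacency} and Subject {subject} in the Sentence"


# Enumerate every combination of the two subfactors.
FACTORS = [word_order_factor(a, s) for a, s in product([True, False], repeat=2)]
for label in FACTORS:
    print(label)
```

Combining the subfactors into one factor group keeps each adjacency-by-subject combination as a separate factor.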

3.3.1.5 Verbal Complexity

Masunaga’s observation about the complexity of the predicate

(Masunaga 1988) is integrated into this study by coding the complexity

in terms of the number of independent morphemes and the presence or

absence of passive (rare) and causative (sase) morphemes:

3 Note that quantifiers are regarded as a part of the object NP. It is worth noting that the quantitative analysis also confirmed the correctness of this assumption, locating no significant difference between object NPs with a quantifier and those without.

(3.10) Coding Scheme for Verbal Complexity
• 1 Independent Morpheme
• 2 Independent Morphemes
• 3 Independent Morphemes
• 4 Independent Morphemes
• 1 Independent Morpheme + rare/sase
• 2 Independent Morphemes + rare/sase
• 3 Independent Morphemes + rare/sase

3.3.1.6 Phonology: Last Phoneme of the Preceding NP

In order to see whether the rate of o-marking is correlated with the

phonological shape of the object NP, I coded the last phoneme of the NP:

(3.11) Coding Scheme for Phonology

/a/, /e/, /i/, /o/, /u/, /N/⁴

4 /N/ represents a mora nasal (Vance 1987).
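
The phonological coding can be illustrated with a short Python sketch. It assumes, purely for illustration, that the object NP is available in a romanized transcription in which a word-final n stands for the mora nasal /N/; the function name `last_phoneme` is hypothetical, and the actual coding in this project was not necessarily automated.

```python
VOWELS = set("aeiou")


def last_phoneme(romanized_np):
    """Return the factor of scheme (3.11) for a romanized object NP."""
    word = romanized_np.strip().lower()
    if not word:
        raise ValueError("empty NP")
    final = word[-1]
    if final in VOWELS:
        return f"/{final}/"
    if final == "n":
        # Assumption: a word-final n transcribes the mora nasal.
        return "/N/"
    raise ValueError(f"unexpected final segment in {romanized_np!r}")


# NPs taken from the examples quoted in this chapter.
for np_text in ["sake", "ore", "koto", "nedan"]:
    print(np_text, last_phoneme(np_text))
```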

3.3.1.7 Animacy

The case marker o is also known to show an extensive nationwide

geographical variation. The maps in the Grammar Atlas of Japanese

Dialects (GAJ) (NLRI 1989) for the accusative case marker make it clear

that at least for the geographical distribution, the animacy of the object

NP plays an important role, so that the zero-marking is strongly

disfavored for animate object NPs, even for the dialects generally known

for favoring the zero-marking (e.g. Kansai dialects). Compare the

following two dialect maps, one for sake ‘sake drink’ as in the sentence

“That person drinks sake” and the other for ore ‘me (male)’ as in the

sentence “Please let me go with you” (the vertical bar symbol and the left-

angled symbols represent the areas where the accusative case marker

appears as zero):⁵

5 The maps for the Hokkaidô and the Okinawa areas are not included here. Also, it should be noted that GAJ includes another map depicting the distribution of the accusative case marker involving sonna koto ‘such a thing.’

Figure 3.2

Distribution of the Accusative Case Marker

after the Inanimate NP sake ‘sake’ [GAJ (NLRI 1989: 6)]

Figure 3.3

Distribution of the Accusative Case Marker

after the animate NP ore ‘me (male)’ [GAJ (NLRI 1989: 7)]

It is easy to see that for the inanimate object sake, more than half of

the whole country prefers a zero-marked form, while one can find only

a handful of such points for the animate noun ore. Here, animacy indeed

works as a key distinction that brings about this radically different

distribution of the zero-marking of (o) across the country.

If animacy affects the geographical distribution, it is quite likely

that it also affects the distribution of the variable (o) in a single speech

community. In order to code for this factor, I set up the following coding

scheme:

(3.12) Coding Scheme for Animacy

• -animate
• +animate -human
• +animate +human
• body part

3.3.1.8 Embeddedness

Empirical studies of historical changes and change in progress

have shown repeatedly that elements in embedded clauses are affected

later than those in main clauses (Hock 1988, Matsuda 1992a, 1993a).

Hooper and Thompson (1973) also found that, in syntax, a smaller

number of transformational rules are applicable to embedded clauses

than to main clauses. All of these observations lead one to expect that o-

marking should be prohibited from occurring more in embedded clauses

than in main clauses.

(3.13) Coding Scheme for Embeddedness
• Main clause
• Adverbial clause
• Gerund
• Quote
• Relative/Noun complement
• Predicate complement
• Double embedding
• Triple embedding

3.3.1.9 Focus Particle

Focus particles are used as a focus device on the predicate in

Japanese sentences, appearing immediately before or after the accusative

case marker, adding such meanings as ‘only’ or ‘even’ as in the examples

below:

(3.14) Dakara sono maru   -no tui-ta toko  -dake -o  mi-nagara
       so     the  circle GEN with   place only  ACC look while
       ‘So while you look at that encircled place only’
       [IJ, FYU/9127-0-470]

(3.15) Warui koto  -bakkari surunda-mono ...
       bad   thing only     do FP
       ‘Because (he) does only bad things’
       [MT, FDO/8850-0-578]

(3.16) Ma,  nedan-made miru koto aru       -n - desu
       well price even look COMP there are COP
       -kedo-ne
       FP FP
       ‘Sometimes I even look at the prices’
       [TH, MUO/9112-0-562]

A cursory look at the tokens led me to suspect that the presence of

the particle was strongly associated with the zero-marking of the

accusative case. The particles will be coded in the following manner:

(3.17) Coding Scheme for Focus Particle
• -bakari, -bakkashi ‘only, just’
• -dake, -dakesika ‘only’
• -nanka, -nante ‘for example’ ‘say’ ‘the group of’
• -nomi ‘only’
• -sika ‘only’
• -sae ‘even’
• -sura ‘even’
• -demo ‘even’ ‘or the like’
• -made ‘even’

3.3.2 External Variables

As for the external variables, three kinds of speaker’s attributes

were coded: age, sex, and area of residence. Age was initially divided into

two groups, below and above 50 years old. The last factor group, area of

residence, refers to the downtown (shitamachi)/uptown (yamanote)

distinction mentioned in the preceding chapter, which Hibiya (1988) and

other sociolinguistic surveys found to be a significant determiner for a

number of variables in Tôkyô Japanese.

(3.18) Coding Scheme for External Variables
• Age (old/young)
• Sex (m/f)
• Area of Residence (downtown/uptown)

3.3.3 Style

Style, as noted by a number of researchers, is expected to be a

prominent factor group in my study. To ensure reliable coding of the

variable, I adopted Labov and Sankoff’s coding scheme for style, which

constructs a style tree with a series of binary distinctions (Labov and

Sankoff 1988):

[Tree diagram not reproduced. Top-level branches: Careful, Casual. Node labels: Response, Language, Soapbox, Careful, Narrative, Group, Kids, Tangent.]

Figure 3.4

Labov and Sankoff’s Style Tree [Labov and Sankoff 1988]

As it appears, the distinctions are based on the topic (kids,

language, tangent, soapbox), the form of the speech (response, narrative)

or the number of participants (group). As such, the distinctions are

more objective than those in Labov (1966), which also relied on channel

cues, though in general, the higher in the tree, the more

objective the standards (Labov in class lecture, Fall 1988).

3.4 Sample

In constructing the sample for this study, I set up an eight-cell

matrix based on three binary social variables: age, sex and residential

categories. The data collection was conducted so that each cell would be

filled by 3-5 speakers, with around 200 tokens of the linguistic variable

(o) from each. The actual numbers of speakers and tokens are as follows:

                 Old (40 and over)      Young (Below 40)
                 Male      Female       Male      Female      TOTAL
Downtown          963         945        838         945      3,691
Uptown            950       1,096        902       1,062      4,010
TOTAL           1,913       2,041      1,740       2,007      7,701

Table 3.1

Breakdown of the Tokens in the Corpus

                 Old (40 and over)      Young (Below 40)
                 Male      Female       Male      Female      TOTAL
Downtown            6           6          4           4         20
Uptown              4           5          3           5         17
TOTAL              10          11          7           9         37

Table 3.2

Breakdown of the Speakers in the Corpus
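
The marginal totals in Tables 3.1 and 3.2 can be recomputed from the eight cells. The Python sketch below copies the cell counts from the two tables and sums them along each social dimension; the dictionary layout and the helper `margin` are illustrative choices, not part of the original analysis.

```python
# Cell counts copied from Tables 3.1 and 3.2; keys are (area, age, sex).
TOKENS = {
    ("downtown", "old", "male"): 963,
    ("downtown", "old", "female"): 945,
    ("downtown", "young", "male"): 838,
    ("downtown", "young", "female"): 945,
    ("uptown", "old", "male"): 950,
    ("uptown", "old", "female"): 1096,
    ("uptown", "young", "male"): 902,
    ("uptown", "young", "female"): 1062,
}
SPEAKERS = {
    ("downtown", "old", "male"): 6,
    ("downtown", "old", "female"): 6,
    ("downtown", "young", "male"): 4,
    ("downtown", "young", "female"): 4,
    ("uptown", "old", "male"): 4,
    ("uptown", "old", "female"): 5,
    ("uptown", "young", "male"): 3,
    ("uptown", "young", "female"): 5,
}


def margin(cells, position):
    """Sum the cells for each level of the dimension at `position`."""
    totals = {}
    for key, count in cells.items():
        totals[key[position]] = totals.get(key[position], 0) + count
    return totals


tokens_by_area = margin(TOKENS, 0)
speakers_by_area = margin(SPEAKERS, 0)
total_tokens = sum(TOKENS.values())
total_speakers = sum(SPEAKERS.values())

# Average number of tokens contributed per speaker in each cell.
tokens_per_speaker = {cell: TOKENS[cell] / SPEAKERS[cell] for cell in TOKENS}

print(tokens_by_area, speakers_by_area, total_tokens, total_speakers)
```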

The sample was collected from three sources: my own fieldwork in

Tôkyô, data from Hibiya’s Tôkyô Japanese corpus (Hibiya 1988), and the

Housewife Corpus collected and compiled by Sachiko Ide and her

associates (Ide, Ikuta, Kawasaki, Hori and Haga 1984). The last source,

which transcribed an entire week’s interactions of a housewife from the

Yamanote area in Tôkyô with ample situational information, was used to

fill in the uptown old female cell.⁶

3.5 Analytical Methodology

Goals of inquiry determine the analytical methodology in any

scientific research. In this case, the problem boils down to the

construction of an optimal model of variation in the zero-marking of the

accusative case marker (o) in natural speech of Tôkyô Japanese speakers

on a minimal number of nominal variables. As the dependent variable is

binary, the problem is best solved by a statistical method known as the

logit model.

6 The Housewife Corpus is tagged for a number of types of situational/stylistic information, including the interlocutor’s background information. These tags were all used when the tokens from the Corpus were coded for style.

The logit model is a widely-used statistical method for a binary

dependent variable with multiple nominal independent variables, and it

is represented by the following general formula:

(3.19) $\ln\frac{p}{1-p} = \ln\frac{p_0}{1-p_0} + \sum_{i} \ln\frac{p_{ij}}{1-p_{ij}}$

where p is the probability of an occurrence of a certain variant, p0 is the

overall mean (a notion close to its counterpart in ANOVA), and pij is the

parameter value of a factor j in a factor group i. With estimated

parameter values, the model can generate the predicted rate of rule-

application, or in my case, the predicted rate of o-marking, for every

combination of independent factors. The set of those predicted values is

then compared with the corresponding observed rates by means of the chi-square

statistic to assess the goodness of fit of the model to the data. For this

dissertation, the estimation of those parameters and the model selection

will be done by GoldVarb (Sankoff and Labov 1979, Rand and Sankoff

1989).
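
To make the arithmetic of the model concrete, the following Python sketch turns an overall mean and a set of factor values into a predicted rate, and then computes a chi-square statistic against observed counts. It reads (3.19) as a sum of log-odds, with the p0 term also taken in log-odds form, and it uses the ordinary Pearson statistic over applications and non-applications; both readings are assumptions made here for illustration, and GoldVarb's internal computation may differ. All numerical values are hypothetical, not estimates from this corpus.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_rate(p0, factor_values):
    """Combine the overall mean with one factor value per factor group."""
    return inverse_logit(logit(p0) + sum(logit(p) for p in factor_values))


def chi_square(cells):
    """Pearson chi-square over cells of (n_tokens, n_applications, predicted_rate)."""
    total = 0.0
    for n, applied, rate in cells:
        expected_yes = n * rate
        expected_no = n * (1.0 - rate)
        total += (applied - expected_yes) ** 2 / expected_yes
        total += ((n - applied) - expected_no) ** 2 / expected_no
    return total


# Hypothetical values: a neutral overall mean, one favoring factor (0.8)
# and one neutral factor (0.5).
rate = predicted_rate(0.5, [0.8, 0.5])
# One hypothetical cell: 10 tokens, 5 of which show the variant.
fit = chi_square([(10, 5, rate)])
print(round(rate, 3), round(fit, 3))
```

A factor value of 0.5 contributes zero log-odds and leaves the prediction unchanged, which is why values above 0.5 are read as favoring the variant and values below 0.5 as disfavoring it.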

The actual analytical procedure for the analysis of (o) is as follows.

First, distributions by each independent factor are checked by raw

percentage, and marginal factors are merged with others (in a

linguistically reasonable way) or are thrown out of the corpus. The tokens

are then checked for any interaction among the factor groups by cross

tabulation. After that, a GoldVarb analysis is run, which yields the

optimal model with a minimal number of factor groups and minimal

distinctions within them. The results of these procedures as applied to my

current corpus will be seen in Chapters Six and Seven. Before getting into

that, however, a more serious pre-analysis look at the data — namely

delimitation of the envelope of variation (Chapter Four) — and a brief

description of the speech community (Chapter Five) are in order.
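
The cross-tabulation step can be sketched as follows. This is a minimal Python illustration, assuming tokens are stored as dictionaries keyed by factor group; the field names, the shortened factor labels, and the toy tokens are invented for the example and do not reproduce the coding files used in this study.

```python
from collections import Counter


def cross_tab(tokens, group_a, group_b):
    """Count tokens for every observed combination of two factor groups."""
    return Counter((t[group_a], t[group_b]) for t in tokens)


def empty_cells(table, levels_a, levels_b):
    """List combinations with no tokens, a possible sign of interaction."""
    return [(a, b) for a in levels_a for b in levels_b if table[(a, b)] == 0]


# Invented tokens; labels are shortened from schemes (3.1) and (3.12).
toy_tokens = [
    {"np_form": "pronoun", "animacy": "+human"},
    {"np_form": "pronoun", "animacy": "+human"},
    {"np_form": "NP", "animacy": "-animate"},
    {"np_form": "NP", "animacy": "+human"},
]
table = cross_tab(toy_tokens, "np_form", "animacy")
gaps = empty_cells(table, ["pronoun", "NP"], ["-animate", "+human"])
print(dict(table))
print(gaps)
```

Empty or very sparse cells of this kind are the marginal cases that the procedure above merges with other factors or removes before the multivariate analysis.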
