3. Methodology
3.1 Introduction
As seen in Chapter Two, most earlier studies on the variable
marking of the accusative case marker have been done on a formal basis
without much regard to the empirical validity of their hypotheses. In
contrast to them, this dissertation is concerned with the actual variation
of the accusative case marking and its relation to the grammatical system
of the language in the speech community. For this purpose, I adopt
variation theory as a research framework in this project. The framework,
originating in Labov’s study of several phonological variables in New
York City (Labov 1966), has been adopted as an analytical basis for
studies of language variation and change (Labov et al. 1968, Wolfram
1969, Anshen 1969, Cofer 1972, Cedergren 1973, Jain 1973, Trudgill 1974,
Modaressi 1978, Rickford 1979, Oliveira 1980, Gambhir 1981, Abdel-
Jawad 1981, Ash 1982, Poplack 1982, Lennig 1983, Labov 1984, Hibiya
1988, Hong 1991, Matsuda 1993a, inter alia).
Variation theory is a way of analyzing language variation, and its
methodological procedures are describable as a set of principles
concerning the data quality (the vernacular principle), the demarcation of
the linguistic environments from which the data is collected (the
envelope of variation), the way it is counted or coded (the accountability
principle), and the way it is analyzed (the multiplicity of factors of
language variation and change).
The vernacular principle states that the most systematic picture of
the language is obtained through the observation of speech where the
least attention is paid to it by the speaker (Labov 1972). In order to obtain
such speech, a valid fieldwork method must be established. Section 3.2
will describe the interview method used in my fieldwork in Tôkyô.
In natural speech, however, linguistic variation occurs only in
specific environments; thus, distinctions must be made between where
the variation is possible, where it is not and where the variants cannot be
reliably distinguished before collecting the data. Because this issue, the
envelope of variation, requires a detailed linguistic analysis, it will be
treated separately in the next chapter, “Delimiting the Envelope of
Variation for the Variable Marking of (o).”
The accountability principle requires that variability be reported “with the
proportion of cases in which the form did occur in the relevant
environment, compared to the total number of cases in which it might
have occurred” (Labov 1969: 738). How will those environments be
coded, and why are those environments picked for examination in the
first place? Section 3.3 will discuss the factors examined in this project
and the linguistic and sociological reasoning behind them. The actual
sample used for this dissertation is described in Section 3.4.
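The proportion described by the accountability principle can be illustrated with a minimal sketch. The token list and the function name `marking_rate` are hypothetical and are not taken from the corpus; the sketch only assumes that every token has already been judged to lie inside the envelope of variation.

```python
from collections import Counter

# Hypothetical coded tokens: each entry is one environment inside the
# envelope of variation, labeled with the variant that actually occurred.
tokens = ["o", "zero", "zero", "o", "zero", "o", "o", "zero", "zero", "zero"]


def marking_rate(coded_tokens, variant="o"):
    """Proportion of tokens in which `variant` occurred, out of all
    tokens in which it might have occurred."""
    if not coded_tokens:
        raise ValueError("no tokens inside the envelope of variation")
    counts = Counter(coded_tokens)
    return counts[variant] / len(coded_tokens)


rate = marking_rate(tokens)
print(f"o-marking: {rate:.0%} of {len(tokens)} tokens")
```

The denominator is the full set of eligible environments, not only those in which the marker surfaced, which is the point of the principle.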
Lastly, most previous empirical studies on language variation
report the existence of multiple constraints on the linguistic variable (the
multiplicity of factors of language variation and change). In order to
evaluate the effectiveness of each constraint in an objective manner, a
valid statistical method must be used. Section 3.5 introduces the
computer program GoldVarb, the solution adopted in this project.
3.2 Fieldwork
My speech samples were collected mostly during summer
fieldwork between May 1990 and August 1993, using a Sony WM-D6C tape
recorder with two lavaliere microphones (Sony ECM-T150). I recruited
speakers from two main sources: networks from previous interviewees
(including the ones from Hibiya’s network) and a community center for
aged people. The former technique has the advantage of being accepted
easily by the speaker, reducing the psychological barrier that usually
exists between the interviewer and the interviewee.
The community centers are found in a number of locations in
Bunkyô ward, which is situated between the Yamanote and Shitamachi
areas. Run by the municipal office, they are used as a sort of salon for
aged people, where they get together to eat lunch, practice their hobbies,
take baths, have parties or just chat with their friends over tea. Their general
atmosphere is very relaxed. Here, I was able to make contacts and
ultimately to conduct a number of interviews with older speakers.
Most of the interviews were done on a one-on-one basis, following
the interview schedule based on the Q-GEN-II protocol developed by
William Labov, to which I added a number of original questions that are
most likely to induce long casual responses from the speakers for
historical, sociological or cultural reasons. For example, for aged people
in general, topics concerning WW II, especially the air raid, or the Kantô
Earthquake in 1923 were quite effective in eliciting narratives and other
kinds of spontaneous speech, while such topics as school or music
generally served a similar purpose for younger speakers.
In addition to the one-on-one interview, I also conducted a series
of group sessions to collect intra-peer interactions (Labov 1984). Two
typical cases are sessions with junior high school students and the
sessions recorded at community centers with aged people. In both cases,
the interviewer’s role was kept to a minimum, only throwing out some
new topics when the chat had died down, or doing minor chores for
them. For most of these speakers, one-on-one interview recordings were
also obtained, achieving a maximum style differentiation which is one of
the main purposes of sociolinguistic interviews (Labov 1966).
An extension of the group session is a recording where a tape
recorder was left in some part of a house and kept recording for
hours as long as interactions were going on. This was typically used for
the meal table and family gatherings.
3.3 Variables Considered1
3.3.1 Internal Variables
The overview of past studies on the internal factors affecting the
variable marking of (o) in Tôkyô Japanese in Chapter Two revealed a
number of hypotheses that must be tested in the current study. Although
it is most desirable to enter all of those factors into a statistical procedure
whereby the most prominent ones are selected through empirical
testing of thousands of tokens from natural speech, such a procedure
would force one to keep coding some factors that are far off the mark for
thousands of tokens, a clearly inefficient strategy. The problem then boils
down to the balance between the exhaustiveness of the variables explored
and the cost of coding a number of factors.
In order to solve this problem, variable selection in this study was
done in two stages. First, a preliminary statistical analysis was run on
2,000 tokens from 8 speakers (one from each social group). Those factors
that were deemed ineffective by bivariate and multivariate analysis were
ignored in further coding. In this phase, external factors were all ignored,
assuming a uniform effect of internal factors on speakers from all
segments of the speech community. Given the general independence
between internal and external factors in variation phenomena
(Labov 1982), such an assumption seems to be well grounded.
In the next phase, all and only the factors that were picked up in
the first phase were coded for the remainder of the corpus, and the final
variable selection was done together with the social variables. This
enabled me to locate the most prominent variables in the most efficient
way.
1 In the variationist literature, it is customary to refer to what general statistics literature calls factors as factor groups, while referring to levels within a variable as factors. Although the usage is somewhat confusing, I will keep using these terms in the following exposition.
3.3.1.1 Linguistic Form of the Object NP
Given that the locus of marking is at the end of the object NP, it is
only natural that one of the prime suspects for influencing the rate of
o-marking is the object NP. As I showed in Section 2.9, HTU (1992: 1252)
found that marking is more frequent after common nouns than other
kinds of nouns (notably pronouns, after which the marking was rare).
One would then like to see finer differentiations of NP types: do all the
pronouns behave similarly? Or is it actually a subset of pronouns that
rarely allows marking? Among other NPs, what about clausal NPs? To
answer these questions, the object NP must be coded for its form —
wh/non-wh pronouns, NP and clausal NP.
(3.1) Coding Scheme for the Linguistic Form of the Object NP
• Pronoun:
  • Wh-pro
  • Non-wh-pronouns:
    • Anaphoric pronouns
    • Deictic Pronouns
• NP
• Clausal NP
3.3.1.2 Final Particle
In view of Masunaga’s finding about the correlation of the
acceptability of o-marking and the presence of final particles (Masunaga
1988), it is necessary to pay attention to the particle as one of the
variables. As I described in Section 2.6 above, however, her deemphasis
analysis left it ambiguous whether the effect was due to its mechanical
presence (so that its meaning does not matter at all) or to its semantic
effect. Therefore, final particles will be coded for two features, form and
meaning:
(3.2) Coding Scheme for Final Particles: Form2
• No FP
• -ne, -nee
• -no
• -sa, -saa
• -ka, -kai, -noka, -kke, -kasira
• -yone
• -yo
• OTHERS
(3.3) Coding Scheme for Final Particles: Meaning
• Absent
• Emphatic (-sa, -mon, -yo, -yoo, -monnee, -mononano, -noyo, -sa, -saa, -nee, -ne, -yone, -ze, -none, -n)
• Confirmation/mitigation (-ne, -desyo(o), -zyan, -zyanai, -kedomo)
• Logical consequence (-wake, -hazu, -wakene, -wakedesyoo)
• Rhetorical question (-zyan)
• Female marker (-wa, -no)
• Explanation (-no, -ne, -wake, -mono)
• Question (-no, -ka, -nokana, -noka, -kana, -kanaa, -ne)
• Command (-na)
• Exclamation (-na, -nonina, -nee, -ne)
• Hearsay (-tte)
• Reason (-mon)
2 It was necessary (even within the formal coding) to group several similar forms into one category to keep the number of factors within one factor group reasonably small.
3.3.1.3 Information Status of the Object NP
Masunaga’s deemphasis analysis also claimed that the shared
information was relevant in the acceptability of o-marking. As I noted
there, this notion needs to be made more precise. Here, I adopt Prince’s
Assumed Familiarity scheme (Prince 1981). Discarding the omniscient
observer’s viewpoint underlying the previous attempts to capture the
different status of various NP’s in discourse (e.g. shared knowledge),
Prince proposed to put the matter at the level of ordinary humans
involved in verbal interaction, who, on the speaker/writer’s side,
produce certain linguistic forms with certain assumptions about the
hearer/reader, and on the hearer/reader’s side, draw certain inferences
based on those forms. The information status of any given NP in a
discourse can only be described reasonably in the form of an assumption,
hence her new terminology Assumed Familiarity.
The Assumed Familiarity taxonomy has three major nodes: New,
Inferrable and Evoked, with each having several subcategories (see the
diagram in Figure 3.1). When a new entity is introduced into the
discourse, (i) the hearer must create a new entity or (ii) the hearer is
assumed to have that entity already in his/her discourse model. The first
case is Brand-New, and the second is Unused. A Brand-New entity is
Anchored if the NP is linked to some other discourse entity;
otherwise, it is Unanchored. This enables one to differentiate the three
New NPs — a bus, a guy I work with and Noam Chomsky — in the following
examples:
(3.4) Brand-New/Unanchored [Prince’s (22c)]
I got on a bus yesterday and the driver was drunk.
(3.5) Brand-New/Anchored [Prince’s (22d)]
A guy I work with says he knows your sister.
(3.6) Unused [Prince’s (22b)]
Noam Chomsky went to Penn.
When a given NP has its discourse entity in the discourse-model, it
is evoked. Evoked entities can be of two types, textually evoked and
situationally evoked. The first type is where the hearer evoked the entity —
which was once New or Inferrable — on textual grounds, i.e. the NP was
already mentioned previously or it is inferrable from other previously
mentioned NPs. The second type is where the hearer knows that s/he can
evoke the entity him/herself for situational reasons (e.g. discourse
participants, salient contextual features, etc). Thus, he in (3.5) above is
textually evoked, while you in the following example is situationally
evoked:
(3.7) Situationally Evoked [Prince’s (22a)]
Pardon, would you have change of a quarter?
In between New and Evoked stands the third category, Inferrable.
Prince (1981: 236) defines it as “[a] discourse entity is Inferrable if the
speaker assumes the hearer can infer it, via logical — or, more commonly,
plausible — reasoning, from discourse entities already Evoked or from
other Inferrables.” When this inference works by a set-member
relationship, it is called Containing Inferrable. In (3.4) above, the driver can
be inferred from a bus (since buses have drivers), hence its information
status becomes Inferrable, while in (3.8), one of these eggs is Containing
Inferrable.
(3.8) Containing Inferrable [Prince’s (22e)]
Hey, one of these eggs is broken!
The relationship among these categories can be represented as in
Figure 3.1 below:
Assumed Familiarity
    New
        Brand-New
            Brand-New (Unanchored)
            Brand-New Anchored
        Unused
    Inferrable
        (Noncontaining) Inferrable
        Containing Inferrable
    Evoked
        Textually Evoked
        Situationally Evoked
Figure 3.1 Assumed Familiarity Hierarchy [Prince 1981: 237]
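For coding purposes, the hierarchy in Figure 3.1 can be written down as a nested mapping. The following is only an illustrative sketch: the names `TAXONOMY`, `major_node` and `PRINCE_EXAMPLES` are hypothetical, and the example NPs are the ones quoted from Prince in (3.4) through (3.8).

```python
# Prince's (1981) Assumed Familiarity categories as a nested mapping:
# major node -> subcategory -> finer subcategories (possibly none).
TAXONOMY = {
    "New": {
        "Brand-New": ["Brand-New (Unanchored)", "Brand-New Anchored"],
        "Unused": [],
    },
    "Inferrable": {
        "(Noncontaining) Inferrable": [],
        "Containing Inferrable": [],
    },
    "Evoked": {
        "Textually Evoked": [],
        "Situationally Evoked": [],
    },
}


def major_node(category):
    """Return the major node (New, Inferrable or Evoked) of a category."""
    for top, children in TAXONOMY.items():
        if category == top:
            return top
        for child, leaves in children.items():
            if category == child or category in leaves:
                return top
    raise KeyError(f"unknown category: {category!r}")


# The NPs discussed in examples (3.4) through (3.8), with their coding.
PRINCE_EXAMPLES = {
    "a bus": "Brand-New (Unanchored)",
    "a guy I work with": "Brand-New Anchored",
    "Noam Chomsky": "Unused",
    "the driver": "(Noncontaining) Inferrable",
    "one of these eggs": "Containing Inferrable",
    "he": "Textually Evoked",
    "you": "Situationally Evoked",
}

for np_text, category in PRINCE_EXAMPLES.items():
    print(f"{np_text}: {category} ({major_node(category)})")
```

Keeping the major node recoverable from each fine-grained code makes it possible to merge subcategories later without recoding the tokens.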
3.3.1.4 Word Order
Subsumed under the heading of word order are (i) the adjacency
between the verb and the object NP, as proposed by Tsutsui (1984) and
Saito (1985), and (ii) the presence or absence of the subject of the sentence,
a point that was discussed in the previous chapter. These two subfactors
are combined into one variable here, making a four-way coding scheme
below:
(3.9) Coding Scheme for Word Order
• Object NP-Verb Adjacent3 and Subject Present in the Sentence
• Object NP-Verb Adjacent and Subject Absent in the Sentence
• Object NP-Verb Non-Adjacent and Subject Present in the Sentence
• Object NP-Verb Non-Adjacent and Subject Absent in the Sentence
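The four-way scheme in (3.9) is the cross product of two binary observations. A minimal sketch of that combination follows; the function name `code_word_order` and its boolean parameters are hypothetical conveniences, not part of the original coding protocol.

```python
from itertools import product


def code_word_order(adjacent: bool, subject_present: bool) -> str:
    """Combine the two binary subfactors into one of the four factors
    of the word order factor group in (3.9)."""
    adjacency = "Adjacent" if adjacent else "Non-Adjacent"
    subject = "Present" if subject_present else "Absent"
    return f"Object NP-Verb {adjacency} and Subject {subject} in the Sentence"


# Enumerate every combination of the two subfactors.
ALL_WORD_ORDER_FACTORS = [
    code_word_order(adjacent, subject_present)
    for adjacent, subject_present in product([True, False], repeat=2)
]

for factor in ALL_WORD_ORDER_FACTORS:
    print(factor)
```

Coding the two subfactors jointly keeps them in a single factor group, at the cost of having to recode if their separate effects are of interest later.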
3.3.1.5 Verbal Complexity
Masunaga’s observation about the complexity of the predicate
(Masunaga 1988) is integrated into this study by coding the complexity
in terms of the number of independent morphemes and the presence or
absence of passive (rare) and causative (sase) morphemes:
(3.10) Coding Scheme for Verbal Complexity
• 1 Independent Morpheme
• 2 Independent Morphemes
• 3 Independent Morphemes
• 4 Independent Morphemes
• 1 Independent Morpheme + rare/sase
• 2 Independent Morphemes + rare/sase
• 3 Independent Morphemes + rare/sase
3 Note that quantifiers are regarded as a part of the object NP. It is worth noting that the quantitative analysis also confirmed the correctness of this assumption, locating no significant difference between object NPs with a quantifier and those without.
3.3.1.6 Phonology: Last Phoneme of the Preceding NP
In order to see whether the rate of o-marking is correlated with the
phonological shape of the object NP, I coded the last phoneme of the NP:
(3.11) Coding Scheme for Phonology
/a/, /e/, /i/, /o/, /u/, /N/4
4 /N/ represents a mora nasal (Vance 1987).
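The phonological coding in (3.11) could be automated along the following lines. This is a sketch under a stated assumption that does not come from the source: the object NP is available in plain ASCII romanization, where a word-final "n" stands for the mora nasal /N/. The function name `code_last_phoneme` is hypothetical.

```python
VOWELS = set("aeiou")


def code_last_phoneme(romanized_np: str) -> str:
    """Return the (3.11) factor for the final phoneme of an object NP.

    Assumes plain ASCII romanization in which a word-final "n"
    represents the mora nasal /N/ (an assumption of this sketch).
    """
    word = romanized_np.strip().lower()
    if not word:
        raise ValueError("empty NP")
    last = word[-1]
    if last in VOWELS:
        return f"/{last}/"
    if last == "n":
        return "/N/"
    raise ValueError(f"cannot code final segment of {romanized_np!r}")


for np_text in ["sake", "ore", "koto", "nedan"]:
    print(np_text, code_last_phoneme(np_text))
```

Anything the function cannot classify raises an error rather than being silently assigned to a factor, so that such tokens are inspected by hand.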
3.3.1.7 Animacy
The case marker o is also known to show an extensive nationwide
geographical variation. The maps in the Grammar Atlas of Japanese
Dialects (GAJ) (NLRI 1989) for the accusative case marker make it clear
that at least for the geographical distribution, the animacy of the object
NP plays an important role, so that the zero-marking is strongly
disfavored for animate object NPs, even for the dialects generally known
for favoring the zero-marking (e.g. Kansai dialects). Compare the
following two dialect maps, one for sake ‘sake drink’ as in the sentence
“That person drinks sake” and the other for ore ‘me (male)’ as in the
sentence “Please let me go with you” (the vertical bar symbol and the left-
angled symbols represent the areas where the accusative case marker
appears as zero):5
5 The maps for the Hokkaidô and the Okinawa areas are not included here. Also, it should be noted that GAJ includes another map depicting the distribution of the accusative case marker involving sonna koto ‘such a thing.’
Figure 3.2 Distribution of the Accusative Case Marker after the Inanimate NP sake ‘sake’ [GAJ (NLRI 1989: 6)]
Figure 3.3 Distribution of the Accusative Case Marker after the Animate NP ore ‘me (male)’ [GAJ (NLRI 1989: 7)]
It is easy to see that for the inanimate object sake, more than half of
the whole country prefers a zero-marked form, while one can find only
a handful of such points for the animate noun ore. Here, animacy indeed
works as a key distinction that brings about this radically different
distribution of the zero-marking of (o) across the country.
If animacy affects the geographical distribution, it is quite likely
that it also affects the distribution of the variable (o) in a single speech
community. In order to code for this factor, I set up the following coding
scheme:
(3.12) Coding Scheme for Animacy
• -animate
• +animate -human
• +animate +human
• body part
3.3.1.8 Embeddedness
Empirical studies of historical changes and change in progress
have shown repeatedly that elements in embedded clauses are affected
later than those in main clauses (Hock 1988, Matsuda 1992a, 1993a).
Hooper and Thompson (1973) also found that, in syntax, a smaller
number of transformational rules are applicable to embedded clauses
than to main clauses. All of these observations lead one to expect that o-
marking should be prohibited from occurring more in embedded clauses
than in main clauses.
(3.13) Coding Scheme for Embeddedness
• Main clause
• Adverbial clause
• Gerund
• Quote
• Relative/Noun complement
• Predicate complement
• Double embedding
• Triple embedding
3.3.1.9 Focus Particle
Focus particles are used as a focus device on the predicate in
Japanese sentences, appearing immediately before or after the accusative
case marker, adding such meanings as ‘only’ or ‘even’ as in the examples
below:
(3.14) Dakara sono maru   -no tui-ta toko  -dake -o  mi-nagara
       so     the  circle GEN with   place only  ACC look while
       ‘So while you look at that encircled place only’
       [IJ, FYU/9127-0-470]
(3.15) Warui koto  -bakkari surunda-mono ...
       bad   thing only     do FP
       ‘Because (he) does only bad things’
       [MT, FDO/8850-0-578]
(3.16) Ma,  nedan-made miru koto aru       -n - desu -kedo-ne
       well price even look COMP there are COP       FP FP
       ‘Sometimes I even look at the prices’
       [TH, MUO/9112-0-562]
A cursory look at the tokens led me to suspect that the presence of
the particle was strongly associated with the zero-marking of the
accusative case. The particles will be coded in the following manner:
(3.17) Coding Scheme for Focus Particle
• -bakari, -bakkashi ‘only, just’
• -dake, -dakesika ‘only’
• -nanka, -nante ‘for example’ ‘say’ ‘the group of’
• -nomi ‘only’
• -sika ‘only’
• -sae ‘even’
• -sura ‘even’
• -demo ‘even’ ‘or the like’
• -made ‘even’
3.3.2 External Variables
As for the external variables, three kinds of speaker’s attributes
were coded: age, sex, and area of residence. Age was initially divided into
two groups, below and above 50 years old. The last factor group, area of
residence, refers to the downtown (shitamachi)/uptown (yamanote)
distinction mentioned in the preceding chapter, which Hibiya (1988) and
other sociolinguistic surveys found to be a significant determinant for a
number of variables in Tôkyô Japanese.
(3.18) Coding Scheme for External Variables
• Age (old/young)
• Sex (m/f)
• Area of Residence (downtown/uptown)
3.3.3 Style
Style, as noted by a number of researchers, is expected to be a
prominent factor group in my study. To ensure reliable coding of the
variable, I adopted Labov and Sankoff’s coding scheme for style, which
constructs a style tree with a series of binary distinctions (Labov and
Sankoff 1988):
[Tree diagram not reproduced. Labels: Careful, Casual; Response, Language, Soapbox, Careful, Narrative, Group, Kids, Tangent]
Figure 3.4 Labov and Sankoff’s Style Tree [Labov and Sankoff 1988]
As it appears, the distinctions are based on the topic (kids,
language, tangent, soapbox), the form of the speech (response, narrative)
or the number of participants (group). As such, the distinctions are made
more objective than those in Labov (1966), which also relied on channel
cues, though in general, the higher in the tree, the more
objective the standards (Labov in class lecture, Fall 1988).
3.4 Sample
In constructing the sample for this study, I set up an eight-cell
matrix based on three binary social variables: age, sex and residential
categories. The data collection was conducted so that each cell would be
filled by 3 - 5 speakers, with around 200 tokens of the linguistic variable
(o) from each. The actual number of speakers and tokens are as follows:
              Old (40 and over)    Young (Below 40)
              Male     Female      Male     Female     TOTAL
Downtown       963        945       838        945     3,691
Uptown         950      1,096       902      1,062     4,010
TOTAL        1,913      2,041     1,740      2,007     7,701
Table 3.1 Breakdown of the Tokens in the Corpus
              Old (40 and over)    Young (Below 40)
              Male     Female      Male     Female     TOTAL
Downtown         6          6         4          4        20
Uptown           4          5         3          5        17
TOTAL           10         11         7          9        37
Table 3.2 Breakdown of the Speakers in the Corpus
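The eight-cell design and the marginal totals of Tables 3.1 and 3.2 can be checked mechanically. In the sketch below the cell counts are transcribed from the two tables; the dictionary layout and the helper names `margin` and `tokens_per_speaker` are hypothetical.

```python
# Cell counts transcribed from Tables 3.1 and 3.2; keys are (area, age, sex).
TOKENS = {
    ("downtown", "old", "male"): 963,
    ("downtown", "old", "female"): 945,
    ("downtown", "young", "male"): 838,
    ("downtown", "young", "female"): 945,
    ("uptown", "old", "male"): 950,
    ("uptown", "old", "female"): 1096,
    ("uptown", "young", "male"): 902,
    ("uptown", "young", "female"): 1062,
}
SPEAKERS = {
    ("downtown", "old", "male"): 6,
    ("downtown", "old", "female"): 6,
    ("downtown", "young", "male"): 4,
    ("downtown", "young", "female"): 4,
    ("uptown", "old", "male"): 4,
    ("uptown", "old", "female"): 5,
    ("uptown", "young", "male"): 3,
    ("uptown", "young", "female"): 5,
}


def margin(table, position, value):
    """Sum all cells whose key has `value` at index `position`
    (0 = area, 1 = age, 2 = sex)."""
    return sum(n for key, n in table.items() if key[position] == value)


def tokens_per_speaker(tokens, speakers):
    """Average number of tokens contributed by a speaker in each cell."""
    return {cell: tokens[cell] / speakers[cell] for cell in tokens}


print("tokens:", sum(TOKENS.values()), "speakers:", sum(SPEAKERS.values()))
for cell, mean in tokens_per_speaker(TOKENS, SPEAKERS).items():
    print(cell, round(mean, 1))
```

The per-cell averages show how closely each cell approaches the target of around 200 tokens per speaker.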
The sample was collected from three sources: my own fieldwork in
Tôkyô, data from Hibiya’s Tôkyô Japanese corpus (Hibiya 1988), and the
Housewife Corpus collected and compiled by Sachiko Ide and her
associates (Ide, Ikuta, Kawasaki, Hori and Haga 1984). The last source,
which transcribed an entire week’s interactions of a housewife from the
Yamanote area in Tôkyô with ample situational information, was used to
fill in the uptown old female cell.6
3.5 Analytical Methodology
Goals of inquiry determine the analytical methodology in any
scientific research. In this case, the problem boils down to the
construction of an optimal model of variation in the zero-marking of the
accusative case marker (o) in natural speech of Tôkyô Japanese speakers
with a minimal number of nominal variables. As the dependent variable is
binary, the problem is best solved by a statistical method known as the
logit model.
6 The Housewife Corpus is tagged for a number of types of situational/stylistic information, including the interlocutor’s background information. These were all used when the tokens from the Corpus were coded for style.
The logit model is a widely-used statistical method for a binary
dependent variable with multiple nominal independent variables, and it
is represented by the following general formula:
(3.19) $\ln\frac{p}{1-p} = \ln\frac{p_0}{1-p_0} + \sum_{i} \ln\frac{p_{ij}}{1-p_{ij}}$
where p is the probability of an occurrence of a certain variant, p0 is the
overall mean, a notion quite close to its counterpart in ANOVA, and pij is the
parameter value of factor j in factor group i. With estimated
parameter values, the model can generate the predicted rate of rule
application, or in my case, the predicted rate of o-marking, for every
combination of independent factors. The set of those predicted values is
then compared with the observed rate of zero-marking by the chi-square
statistic to see the goodness of fit of the model to the data. For this
dissertation, the estimation of those parameters and the model selection
will be done by GoldVarb (Sankoff and Labov 1979, Rand and Sankoff
1989).
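How the logit model turns parameter values into a predicted rate, and how a chi-square statistic compares predictions with observations, can be sketched as follows. This is a minimal illustration and not GoldVarb's own procedure: the function names, the input value and the factor weights are hypothetical, and the chi-square shown is the ordinary Pearson statistic over applications and non-applications in each cell.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_rate(p0, factor_weights):
    """Combine the overall mean p0 with one parameter value per factor
    group by adding their log-odds, then convert back to a probability."""
    total = logit(p0) + sum(logit(w) for w in factor_weights)
    return inverse_logit(total)


def chi_square(cells):
    """Pearson chi-square over cells given as
    (number of tokens, observed applications, predicted rate).
    Predicted rates must lie strictly between 0 and 1."""
    total = 0.0
    for n, observed, rate in cells:
        expected = n * rate
        total += (observed - expected) ** 2 / expected
        total += ((n - observed) - (n - expected)) ** 2 / (n - expected)
    return total


# Hypothetical values, not estimates from the corpus.
p0 = 0.40
weights = [0.70, 0.35]  # one parameter value per factor group
rate = predicted_rate(p0, weights)
print(f"predicted rate: {rate:.3f}")
print(f"chi-square: {chi_square([(100, 30, rate)]):.3f}")
```

A parameter value of 0.5 has log-odds of zero and so leaves the prediction at p0; values above 0.5 raise the predicted rate and values below 0.5 lower it.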
The actual analytical procedure for the analysis of (o) is as follows.
First, distributions by each independent factor are checked by raw
percentage, and marginal factors are merged with others (in a
linguistically reasonable way) or are thrown out of the corpus. The tokens
are then checked for any interaction among the factor groups by cross
tabulation. After that, a GoldVarb analysis is run, which yields the
optimal model with the minimal number of factor groups and the minimal
distinctions within them. The results of these procedures as applied to my
current corpus will be seen in Chapters Six and Seven. Before getting into
that, however, a more serious pre-analysis look at the data — namely
delimitation of the envelope of variation (Chapter Four) — and a brief
description of the speech community (Chapter Five) are in order.
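The cross-tabulation step of the procedure described above can be sketched as follows. The tokens, the choice of factor groups and the function name `cross_tabulate` are hypothetical; the sketch only shows how counts per pair of factors expose empty or lopsided cells before a multivariate analysis is attempted.

```python
from collections import Counter

# Hypothetical coded tokens: (animacy, adjacency, variant).
tokens = [
    ("-animate", "adjacent", "zero"),
    ("-animate", "adjacent", "zero"),
    ("-animate", "adjacent", "o"),
    ("-animate", "non-adjacent", "o"),
    ("+animate +human", "adjacent", "o"),
    ("+animate +human", "adjacent", "zero"),
    ("+animate +human", "non-adjacent", "o"),
    ("+animate +human", "non-adjacent", "o"),
]


def cross_tabulate(coded_tokens):
    """Return {(factor1, factor2): (zero-marked count, total count)}."""
    totals = Counter()
    zeros = Counter()
    for factor1, factor2, variant in coded_tokens:
        totals[(factor1, factor2)] += 1
        if variant == "zero":
            zeros[(factor1, factor2)] += 1
    return {cell: (zeros[cell], n) for cell, n in totals.items()}


for cell, (zero_count, n) in sorted(cross_tabulate(tokens).items()):
    print(cell, f"{zero_count}/{n} zero-marked")
```

Cells with very few tokens, or with a rate that reverses the pattern of the other cells, are the ones to inspect for interaction between the two factor groups.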