
Ming-Chung Cheng Voice Onset Time of Syllable-Initial Stops in Sixian Hakka ◆　193　◆

Voice Onset Time of Syllable-Initial Stops in Sixian Hakka: Isolated Syllables

Ming-Chung Cheng

Institute of Hakka Language and Communication

National United University Associate Professor

Abstract

This study investigates the “voice onset time” (VOT) variation in syllable-initial stops of Sixian

Hakka in isolated syllables, according to factors including place of articulation (PoA), following

vowel context, and the speakers’ age and gender. Thirty-six participants provided speech recordings.

They were required to repeat a randomized list of 36 speech stimuli constructed by using 6 syllable-

initial stops [p, t, k, ph, th, kh] preceding 3 corner vowels [i, u, a] in 2 high-level tones [55, 55]. A total

of 3,888 samples were measured for VOT values in PRAAT. The research results showed that the

factors other than gender affected the VOT production of Hakka syllable-initial stops. In accordance

with observations in previous sociolinguistic studies of gender, gender differences concerning the

VOT absolute values were also observed despite showing no statistically significant differences.

Furthermore, the factorial interaction between the stops’ PoA and the following vowel context, which

was discussed in detail, showed that vowel height has a stronger effect on VOT than vowel frontness.

To conclude, this study substantially contributes to the understanding of VOT and its variation in

Hakka, and to further comparisons of the VOT systems of stops between Hakka and other Chinese

dialects or foreign languages, particularly for foreign spouses who immigrated to Hakka villages.

Keywords: age, gender, Hakka, stop, voice onset time

Corresponding author: Ming-Chung Cheng, E-mail: [email protected]

Received: Feb. 25, 2013; Modified: Apr. 11, 2013; Accepted: Apr. 12, 2013

doi: 10.6210/JNTNULL.2013.58(2).08


1. Introduction

Stops (or plosives) are a significant group of consonants that exist in all the world’s languages

(Henton, Ladefoged, and Maddieson 1992; Maddieson and Ladefoged, 1996).1 Articulatorily,

producing stops requires that the airstream in the vocal tract be totally blocked and the closure be

briefly held. Once the blockage is released, the airstream behind the oral constriction suddenly

escapes from the mouth. The temporal interval between the release of the constriction and the onset

of vocal cord vibration is known as voice onset time (VOT) (Klatt 1975; Lisker and Abramson 1964;

Zlatin 1974).2 VOT has long been viewed as a reliable temporal-acoustic measure for the distinction

of voicing and aspiration in stops, and can be generally classified into three categories, voice lead,

short lag and long lag, depending on whether voicing precedes or follows the release burst (voiced

vs. voiceless) and how long the airflow release lasts (aspirated vs. unaspirated). In principle, stops

can be produced at any place in the vocal tract, but cross-linguistically they tend to occur at

three common places: bilabial, alveolar, and velar (Gamkerlidze 1978; Maddieson 1984).

The current study has three purposes. First of all, this study aims to investigate VOT and

its variation of syllable-initial stops in Hakka. Due to the universal existence of stops, and the

easy access to, and great progress in, modern technology for acoustic analysis, a large body of

research has investigated VOT in recent decades, including (1) cross-linguistic

surveys (e.g., Cho and Ladefoged 1999; Keating, Linker, and Huffman 1983; Shimizu

1996; Wu 2009), (2) language-specific research (e.g., Cho, Jun, and Ladefoged 2002; Docherty

1992; Gósy 2001; Han and Weitzman 1970; Helgason and Ringen 2008; Kong, Beckman, and

Edwards 2011; Nielsen, 2011; Öğüt et al. 2006; Riney et al. 2007; Ringen and Suomi 2012; Rosner

et al. 2000; Williams 1977), (3) second and foreign language teaching and learning (e.g., Antoniou

et al. 2011; Chao and Chen 2008; Chao, Khattab, and Chen 2006; Chung 2010; Flege and Eefting

1987; Hazan and Boulakia 1993; Kang and Guion 2006; Kehoe, Lleó, and Rakow 2004; Khattab

1 For possible reasons for the cross-language existence of stops, see Ran’s (2008, 67-70) explanations.

2 Besides VOT, other acoustic characteristics for distinguishing stops in different places of articulation (PoAs) are spectral peaks of the release bursts, formant transitions and silent periods. Stops in different PoAs will have their spectral energies concentrated in different frequency ranges: bilabials (500~1500 Hz), alveolars (over 4000 Hz) and velars (1500~4000 Hz) (T. Lin and Wang 2005; Tsao 1996; Wu and Lin 1989). Likewise, stops of different PoAs have an influence on the formant transitions of the following vowels (Wu and Lin 1989). As for silent gaps, they are an indispensable ingredient of stops and a crucial element in perceiving stops (Liang 2001; Ren 1981). In general, the lengths of stops’ silent periods correspond inversely to their PoAs. To be specific, the more posterior the stop’s PoA is, the shorter the silent period will be. For detailed discussion, refer to Ran (2008).



2000; Liao 2005; Ogasawara 2011; Riney and Takagi 1999; Simon 2010; Thornburgh and Ryalls

1998; Yavaş and Wildermuth 2006), and (4) clinical and pathological practice (e.g., Auzou et al.

2000; Baker et al. 2007; Fischer and Goberman 2010; Jäncke 1994; Khouw and Ciocca 2007; Liu et

al. 2007, 2008; Ng and Wong 2009; Ringen and Suomi 2012; Tseng 1994; Tsui and Ciocca 2000).

Despite this extensive research, most of the literature has concentrated on western languages (e.g., English

and French). Comparatively little attention has been paid to Chinese dialects / varieties.3 For this reason,

this study focuses on Hakka, with the aim of remedying the paucity of Hakka VOT studies.

The second purpose of this study is to decide the exact VOT categories to which Hakka stops

belong. There are six voiceless stops in Hakka, appearing in three places (bilabial [p, ph], alveolar

[t, th], and velar [k, kh]), and in two manners (unaspirated vs. aspirated). These stops are cross-

linguistically common, but a systematic, large-scale investigation of the VOT variation is still in

urgent demand for the following reasons. First, it is broadly reported that VOT varies under many

affecting factors (see section 2 for a review). Yet, how these factors influence the VOT production

in Hakka stops is scarcely examined, let alone their extent and interaction.4 Second, VOT varies

across languages, falling into types along the VOT continuum on a language-particular basis. Cho

and Ladefoged (1999), for example, separate voiceless unaspirated and aspirated stops into four

3 Mandarin Chinese seems to be an exception. A considerable amount of research (Liao 2005; Peng 2009; Ran 2008; Rochet and Fei 1991; Wu 2004; Wu and Lin 1989; among others) has explored the VOT issue in Mandarin Chinese, despite the small number of participants in most of these studies. In addition, on account of the worldwide popularity of learning Mandarin Chinese as a foreign or second language, more and more studies examine learners’ production problems with stops by comparing VOT between Mandarin Chinese and the learners’ native languages, like English (Chao and Chen 2008; Shi and Liao 1986; Wang, Chen, and Tsai 2006), German (Wen, Ran, and Shi 2009), Japanese (Wang and Shangguan 2004), Korean (Gao 2001), Indonesian (Y.-G. Lin and Wang 2005), and so forth.

4 To my knowledge, few studies have investigated the VOT production of Hakka stops, except Liang (2005) and Peng (2009). On the basis of southern Sixian Hakka, Liang (2005) probed the VOT production of syllable-initial stops by 20 participants (10 males and 10 females) with their ages ranging from 27 to 72. Not aiming specifically at stops’ VOT, he simply provided a general VOT exploration of Hakka stops of different places of articulation, without taking other affecting factors into consideration. Peng (2009) collected 11 males’ and 10 females’ production of six syllable-initial stops in Sixian Hakka followed by [i, a, u], and examined the VOT variation under factors such as place of articulation, vowel context, tone and gender. In comparison with Liang (2005), she supplied more detailed explanations of the VOT variation in Hakka stops, but some weaknesses remain. First of all, similar to Liang, the age range of her participants was extremely wide (36~80 years old), so she was unable to take the age factor into consideration. Second, there were only 21 subjects (10 males and 11 females) taking part in her study. Third, she claimed that the participants were all native speakers of Sixian Hakka, but their residences were not clearly stated. It is well known that Hakka is always in linguistic contact with, and is influenced by, Southern Min and Mandarin Chinese in Taiwan. This may have a distorting effect on VOT. For this reason, this study recruits participants living in Miaoli County, the area most densely populated by speakers of (northern) Sixian Hakka. Hakka is also the daily language used by the people recruited in this study, which supports the homogeneity of the participants and the reliability of the study.



categories: unaspirated (0~30 milliseconds, ms), slightly aspirated (around 50 ms), aspirated (around

90 ms), and highly aspirated (over 90 ms). On the basis of Cho and Ladefoged’s elaborate typology,

which VOT categories do the roughly-divided Hakka stops belong to? Obviously, both issues await

further investigation.

The third purpose of this study is to set up a VOT reference norm for foreigners’ learning

Hakka or instructors’ teaching Hakka. In recent decades, many foreigners, known as foreign spouses and

usually female, have married Hakka people and immigrated to Hakka villages. On account of their

diverse language backgrounds (e.g., Vietnamese, Indonesian, Thai, Filipino (Tagalog), etc.), there are

differences in the systems of stop consonants. For instance, though there exist [p, t, k] in Vietnamese,

[p] cannot occur syllable-initially. Moreover, there are no aspirated stops in Tagalog. In addition

to [p, t, k, ph, th, kh], there are voiced stops [b, d] in Thai which are lacking in Hakka. All these

differences in the stop systems across languages make it necessary to survey Hakka stops’ VOT thoroughly

in order to compare the stop systems of foreign languages with that of Hakka, give Hakka

instructors a reference standard and help these foreign spouses learn Hakka well. Actually, the issue

of stop learning problems between Mandarin and foreign languages has been extensively compared

(see footnote 3 for some literature), but similar explorations are still lacking between Hakka and

foreign languages, except English. The study hopes to set up a VOT reference norm. With this norm,

in addition to Hakka language learning / teaching, Hakka-language studies can be further extended

to interrelated topics, like speech recognition and synthesis, pathological treatment, and Hakka as a

second / foreign language for foreigners, all of which have long been missing from Hakka studies,

and call for further examinations.

In view of the purposes stated above, conducting a VOT study of Hakka stops is quite

necessary, and motivates this study. This study is organized as follows. Section 2 offers a concise

review of the factors that affect VOT. Section 3 presents the methods in this study, inclusive of

participants, recording stimuli, speech recording procedure, and acoustic and statistical analysis.

Section 4 shows the results of the effects on VOT resulting from the factors under investigation. Section

5 discusses the experimental results generally, along with special attention given to the interaction

between stops’ place of articulation and following vowel context. Section 6 concludes this study,

together with some suggestions for future research.

2. Affecting Factors for VOT

As a temporal-acoustic measure, VOT is contextually sensitive, and is influenced by numerous



factors. For example, studies in VOT have indicated that speaking rate is decisive to the production

of VOT in voiceless long-lag plosives (Beckman et al. 2011; Kessinger and Blumstein 1997, 1998;

Magloire and Green 1999; Pind 1995, 1996; Volaitis and Miller 1992), with faster speaking rate

leading to shorter VOT values. Nonetheless, the effect of speaking rate upon VOT is not obvious in

voiceless short-lag and voiced plosives. Besides, utterance type is reported to affect VOT (Baran,

Laufer, and Daniloff 1977; Lisker and Abramson 1967), with VOT longer in isolated words than

in phrases and sentences. Speech register affects VOT as well, with infant-directed speech having

longer VOT than adult-directed speech (Englund 2005; Sundberg and Lacerda 1999). VOT varies

across fundamental frequency (F0) as well, with longer VOT for voiceless stops spoken at relatively

high F0, compared to VOT spoken at low or medium F0 (McCrea and Morris 2005). The remainder of

this section reviews the factors affecting VOT that are central to this study (i.e.,

place of articulation, following vowel context, and speakers’ age and gender).5

Place of articulation is by far the most-studied factor for VOT variation. With all else equal,

there is a general pattern existing between stops’ articulatory positions and VOT lengths. Specifically,

the more posterior the point of occlusion is, the longer the VOT will be. Being attributed to universal

phonetic implementation rules conditioned by physiological constraints (Fischer-Jørgensen 1980),

this pattern has been broadly supported by cross-linguistic or language-specific explorations (e.g.,

Abdelli-Beruh 2009; Helgason and Ringen 2008; Jessen and Ringen 2002; Peterson and Lehiste

1960; Riney et al. 2007). For example, Lisker and Abramson (1964) and Cho and Ladefoged (1999)

investigated the stops in 11 and 18 languages respectively, indicating that the commonest sequence

is velars > alveolars > bilabials. Further, Cho and Ladefoged summarized six physiological and

aerodynamic characteristics of speech production to explicate VOT variation resulting from different

articulatory positions of stops.6 Still, not all languages conform to this pattern. For example, in Table

1, bilabial stops have longer VOT than alveolar stops in aspiration (Cantonese, Eastern Armenian and

5 Readers may have a question in mind. Given so many affecting factors, why did the author choose just these four? Limited time and funding precluded the author from taking too many factors into account. Thus, the author chose the factors that were the most commonly studied across languages, putting aside others for further research. As a matter of fact, even for these factors, little is known about the VOT of Hakka stops. Hence, understanding the VOT variation under these affecting factors will contribute not only to the description of VOT in Hakka, but also to helping foreigners learn Hakka. This is the reason why the author selected these factors in this research.

6 The six physiological / aerodynamic characteristics consist of (1) the volume of the cavity behind the point of constriction, (2) the volume of the cavity in front of the point of constriction, (3) movement of articulators, (4) extent of articulatory contact area, (5) change of glottal opening area, and (6) temporal adjustment between closure duration and VOT. For elaborate accounts, please see Cho and Ladefoged (1999, 209-213). For relevant discussion of the VOT production, also see Hardcastle (1973), Maddieson (1997) and Stevens (1998).



Mandarin Chinese) or non-aspiration (Tamil, Navajo, Hungarian and Japanese). Dahalo seems to be

the most uncommon language in which alveolar stops come out with the longest VOT. The existence

of the deviant patterns from the general one suggests language specificity in VOT (Gósy 2001).7

Table 1　Summary of the VOT Values Reported in Some Studies

Reference                     Language            [p]    [t]    [k]    [ph]   [th]   [kh]
Lisker and Abramson (1964)    Cantonese           9      14     34     77     75     87
                              Tamil               12     8      24     *      *      *
                              Eastern Armenian    3      15     30     78     59     98
Cho and Ladefoged (1999)      Navajo              12     6      45     *      *      *
                              Dahalo              20     42     27     *      *      *
Gósy (2001)                   Hungarian           24.6   23.3   50.2   *      *      *
Riney et al. (2007)           Japanese            30     28.5   56.7   *      *      *
Chao and Chen (2008)          Mandarin Chinese    14     16     27     82     81     92

Note: All values are in ms.
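
As a quick check of the pattern discussed around Table 1, the following Python sketch encodes the tabulated means and tests, for each language, whether VOT increases strictly from bilabial to alveolar to velar. The variable and function names are illustrative only and are not part of the original study.

```python
# VOT means in ms, copied from Table 1 as (bilabial, alveolar, velar).
# Names below are illustrative; they do not come from the cited studies.
UNASPIRATED = {
    "Cantonese": (9, 14, 34),
    "Tamil": (12, 8, 24),
    "Eastern Armenian": (3, 15, 30),
    "Navajo": (12, 6, 45),
    "Dahalo": (20, 42, 27),
    "Hungarian": (24.6, 23.3, 50.2),
    "Japanese": (30, 28.5, 56.7),
    "Mandarin Chinese": (14, 16, 27),
}
ASPIRATED = {
    "Cantonese": (77, 75, 87),
    "Eastern Armenian": (78, 59, 98),
    "Mandarin Chinese": (82, 81, 92),
}


def follows_general_pattern(vot):
    """Return True if bilabial < alveolar < velar for one stop series."""
    bilabial, alveolar, velar = vot
    return bilabial < alveolar < velar


def conforming(table):
    """List the languages whose series follows the general pattern."""
    return [lang for lang, vot in table.items() if follows_general_pattern(vot)]


if __name__ == "__main__":
    print("Unaspirated:", conforming(UNASPIRATED))
    print("Aspirated:", conforming(ASPIRATED))
```

On these values, only Cantonese, Eastern Armenian and Mandarin Chinese follow the general sequence in the unaspirated series, and none of the three aspirated series does, because the bilabial mean exceeds the alveolar mean in each.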

VOT is also known to vary in different vowel contexts (Docherty 1992; Esposito 2002; Gósy

2001; Higgins, Netsell, and Schulte 1998; Klatt 1975; Port and Rotunno 1979; Rochet and Fei

1991; Smith 1978; Weismer 1979). In terms of vowel height, it has been generally agreed that stops

before high vowels have longer VOT than those before low ones (Klatt 1975; Thornburgh and Ryalls

1998; Yavaş 2002). For example, analyzing 40 males’ and 40 females’ production of English CV

syllables constructed by [p, t, k, b, d, g] in combination with [i, a, u], Morris, McCrea, and Herring

(2008) illustrated that VOT is significantly shorter before [a] than before [i, u].8 This varying

phenomenon results from the different opening degrees of the oral cavity (stricture) in the production

of different vowels and the changes in laryngeal cartilage positioning and vocal fold and vocal tract

tension (Higgins et al. 1998). Moreover, even though it has been studied less often than

vowel height, vowel frontness is reported to affect VOT, too.9 For instance, Gósy (2001)

explored five females’ production of Hungarian voiceless stops [p, t, k] before fourteen vowels in

CV syllables, discovering that [t, k] have longer VOT when preceding front vowels than back vowels,

7 For more thorough discussion on the relation between VOT and stops’ places of articulation, refer to Abdelli-Beruh (2009).

8 A minority of studies did not follow this generally-accepted VOT / vowel pattern. For example, in his investigation of Swedish stops, Fant (1973) reported that aspirated stops have longer VOT in front of [a] than before [i, u].

9 Contradictorily, some studies, like Chao et al. (2006) and Chen, Chao, and Peng (2007), indicated that vowel frontness did not play a role in VOT variation.



but [p] reverses the order. Investigating stops in Taiwan Mandarin and Hakka, Peng (2009) observed

that [p] in Taiwan Mandarin and [p, ph] in Hakka have longer VOT before [u] than before [i]. Similar

findings were also reported by Rochet and Fei (1991). Despite these observations, none of the

studies cited above fully accounted for the differences between [p, ph] and

[t, th, k, kh] before front and back vowels. This intricate issue will be examined in detail in

section 5, in terms of the interaction among VOT, stops’ PoA and vowel context.

In addition, nonlinguistic factors, such as age and gender, are also manipulated to test whether

they affect stops’ VOT. As far as the age factor is concerned, the findings are inconsistent. Some

studies (Neiman, Klich, and Shuey 1983; Petrosino et al. 1993) found no significant difference

among age groups, although older adults produced greater VOT variability and longer syllable duration

than young adults. Others (Morris and Brown 1987; Sweeting and Baken 1982) found that the age

influence emerged merely in voiceless stops, with the elderly producing shortened VOT in specific

stop consonants ([p, t] in Morris and Brown 1987; [p] in Sweeting and Baken 1982). Still others (Liss,

Weismer, and Rosenbek 1990; Ryalls, Simon, and Thomason 2004; Smith, Wasowics, and Preston

1987; Torre and Barlow 2009; Yao 2007) reported that age is a considerable factor of VOT, with a

general tendency that VOT articulated by the young is longer than that produced by the old.10

As for gender, an extensive literature has examined this factor, and the common

pattern is that females exhibit longer VOT than males. This pattern has been supported by a variety

of studies of voiceless stops in English (Robb, Gilbert, and Lerman 2005; Ryalls and Zipprer 1997;

Swartz 1992; Sweeting and Baken 1982; Whiteside, Henry, and Dobbin 2004; Whiteside and Irving

1997, 1998; Whiteside and Marshall 2001).11 Smith (1978) and Swartz (1992) reported that females

produced longer VOT than males in English voiced stops, too. There are also a small number of

studies demonstrating no male / female difference in VOT (Morris et al. 2008; Öğüt et al. 2006;

Ryalls et al. 2004; Syrdal 1996), or even longer VOT in males than in females (Oh 2011). Clearly,

previous research in consideration of the effect of speakers’ age and gender on VOT variation has

not yet arrived at a consensus.

In a nutshell, despite the large literature on the factors affecting VOT, little of it

deals specifically with VOT variation in Hakka. How the aforementioned factors

10 In comparison with adults, children also produce more variable VOT (Koenig 2000; Smith 1978).

11 It should be noted that, even in the same studies, the results may be inconsistent, owing to different segments, tasks or designs (Morris et al. 2008; Smith 1978; Swartz 1992; Whiteside and Irving 1998). For instance, women were reported to produce longer VOT in [p, t, k], but men appeared to produce longer VOT in [b, d, g] (Smith 1978; Whiteside and Irving 1998).



affect Hakka stops’ VOT is scarcely explored, and, undoubtedly, is worth further investigation. The

current study intends to achieve this goal and enhance the comprehension of the VOT issue in Hakka

stops. Additionally, VOT variation is a complicated issue. To make it simple and clear, this study

focuses simply on stops in syllable-initial positions. Only when how VOT varies in this position

becomes clear can VOT variation of stops in other linguistic positions or contexts acquire a basis for

further comparisons, either intra-linguistically or inter-linguistically.

3. Methodology

3.1 Participants

Thirty-six Hakka people (18 males and 18 females) in Miaoli County, with Sixian Hakka as

their mother tongue, joined this study. They were divided into six gender-balanced groups with clear

separation in age (elderly: over 55, middle-aged: 35-55 and younger: 15-35). The mean ages were

25.33/42.67/59.67 years (younger/middle-aged/elderly) for males and 24.67/41.83/68.83 years for females. The participants were

randomly selected for speech recordings, but they were still required to match the following criteria.

First, they speak (northern) Sixian Hakka at home and in daily conversation with other Hakka

people.12 Second, they have been living in Gongguan Township and Miaoli City for over

fifteen years. Third, they are free from speech, language, hearing or pathological disorders.

3.2 Speech Stimuli

This study aims to explore the VOT variation of Hakka syllable-initial stops. A list of

36 syllables was used as speech stimuli, as listed in Table 2. These syllables were constructed by

six syllable-initial stops [p, t, k, ph, th, kh] in combination with three corner vowels [i, u, a] and two

high-level tones [55, 55].13 Excluding [tha55] and [ki55] which are lexical gaps (permissible syllables

in Hakka, but currently without lexical meanings), all remaining 34 syllables are real words and

frequently used in their daily lives.

12 Most of the participants, especially the young and the middle-aged groups, can also speak Mandarin, because Mandarin is the standard language used in education and mass media communication. In truth, Hakka people are usually bilingual in Hakka and Mandarin. It is an unavoidable fact that Mandarin has a higher social status. Moreover, some of them can even speak Southern Min, but not fluently.

13 Hakka is a tone language, so every syllable bears a citation tone. On the basis of Lo (2007), there are six citation tones in northern Sixian Hakka: Yinping [24], Yangping [11], Shangsheng [31], Qusheng [55], Yinru [32] and Yangru [55], where the numbers in square brackets represent pitch values in terms of Chao’s (1930) five-point scale.



Table 2　Speech Stimuli Adopted in This Study

Stop   Vowel   Unchecked   Checked      Stop   Vowel   Unchecked   Checked
[p]    [i]     [pi55]      [pit55]      [ph]   [i]     [phi55]     [phit55]
       [u]     [pu55]      [put55]             [u]     [phu55]     [phuk55]
       [a]     [pa55]      [pat55]             [a]     [pha55]     [phak55]
[t]    [i]     [ti55]      [tit55]      [th]   [i]     [thi55]     [thit55]
       [u]     [tu55]      [tuk55]             [u]     [thu55]     [thuk55]
       [a]     [ta55]      [tak55]             [a]     [tha55]     [thak55]
[k]    [i]     [ki55]      [kit55]      [kh]   [i]     [khi55]     [khip55]
       [u]     [ku55]      [kut55]             [u]     [khu55]     [khuk55]
       [a]     [ka55]      [kap55]             [a]     [kha55]     [khap55]

The speech stimuli in Table 2 can be classified into two types, checked-tone and unchecked-tone

syllables, depending on whether there are [p, t, k] codas. Appearing with unreleased [p, t, k] codas,

checked-tone syllables are always produced more rapidly, and have shorter syllable duration than

unchecked-tone ones. This length difference also gives rise to different VOT variation, with VOT in

unchecked-tone syllables longer than that in checked-tone syllables (Peng 2009).14 Besides, there are

six citation tones in Sixian Hakka: four unchecked and two checked tones. If the VOT tokens under

all citation tones are collected and averaged, the resulting VOT means will be skewed toward the longer

unchecked-tone values, because tokens of unchecked tones outnumber those of checked tones. To

remove this bias on the VOT means, this study took into consideration both types of syllables having

the same pitch [55], but differing only in being checked or unchecked. Hence, this study will not

take the tone factor into discussion.
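
The averaging bias described above can be illustrated with simple arithmetic. The sketch below uses the [ph] means that this paper reports in footnote 17 for Qusheng (58.6 ms, unchecked) and Yangru (46.9 ms, checked). It assumes, purely for illustration, that every unchecked tone behaves like Qusheng and every checked tone like Yangru; that assumption is not tested in this study, and the function name is hypothetical.

```python
def pooled_mean(unchecked_vot, checked_vot, n_unchecked, n_checked):
    """Mean VOT when tone categories contribute in the given proportions."""
    total = unchecked_vot * n_unchecked + checked_vot * n_checked
    return total / (n_unchecked + n_checked)


# [ph] means in ms, as reported in footnote 17 of this paper.
QU_PH = 58.6      # Qusheng, unchecked
YANGRU_PH = 46.9  # Yangru, checked

# ASSUMPTION (illustration only): all four unchecked tones pattern like
# Qusheng and both checked tones pattern like Yangru.
all_six_tones = pooled_mean(QU_PH, YANGRU_PH, n_unchecked=4, n_checked=2)

# Balanced design used in this study: one unchecked and one checked tone.
balanced = pooled_mean(QU_PH, YANGRU_PH, n_unchecked=1, n_checked=1)

if __name__ == "__main__":
    print(round(all_six_tones, 2), round(balanced, 2))
```

Under that assumption, pooling four unchecked tones with two checked tones gives a mean of about 54.7 ms, against 52.75 ms for the balanced pair, which is the direction of bias the paragraph describes.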

3.3 Recording Procedure

Speech recording was performed in a sound-proof room to ensure high recording quality.

A unidirectional microphone (PHILIPS SBC-ME470), with a frequency response

ranging from 50 Hz to 18,000 Hz, was connected to a notebook computer (ACER TravelMate 6252,

with a 2.13 GHz CPU and 1 GB of SDRAM), and was placed ten centimeters away from the subjects’

mouths at chest level. Speech samples were recorded directly and digitized into the computer

by means of PRAAT (Boersma and Weenink 2009), at the default sampling frequency of

44.1 kHz. The samples were segmented and saved as WAV files for further analysis.

14 The VOT lengths between unchecked tones and checked tones were compared in Peng (2009), with the result showing that VOT in the former was longer than that in the latter. However, whether there were significant VOT differences among the four unchecked tones in Hakka was not mentioned at all.



For subjects to be familiar with the stimuli, the word list had been given to them a week before

sound recording was conducted. In the recording session, subjects were instructed to pronounce

every stimulus at the same pace and intensity. The stimuli were presented randomly. Each stimulus

was pronounced five successive times.15 The middle three were extracted for VOT measurements,

because they are more stable than the first and last ones. Every subject thus contributed 108 speech

tokens (36 stimuli multiplied by 3 times), and the total number of tokens for acoustic analysis was

3,888 (108×36).
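
The token selection and the token counts above can be written out as a short Python sketch. The helper name `middle_three` is hypothetical, and the slice simply encodes the rule of discarding the first and last of the five repetitions.

```python
def middle_three(repetitions):
    """Keep repetitions 2-4 of five, discarding the first and the last."""
    if len(repetitions) != 5:
        raise ValueError("expected exactly five repetitions per stimulus")
    return repetitions[1:4]


# Design sizes stated in the paper.
N_STOPS, N_VOWELS, N_TONES = 6, 3, 2
N_SPEAKERS = 36
KEPT_PER_STIMULUS = 3

n_stimuli = N_STOPS * N_VOWELS * N_TONES            # 36 stimuli
tokens_per_speaker = n_stimuli * KEPT_PER_STIMULUS  # 108 analyzed tokens
total_tokens = tokens_per_speaker * N_SPEAKERS      # 3,888 analyzed tokens

if __name__ == "__main__":
    print(middle_three(["r1", "r2", "r3", "r4", "r5"]))
    print(n_stimuli, tokens_per_speaker, total_tokens)
```

Each speaker therefore produced 180 repetitions in the session (36 stimuli, five times each), of which 108 entered the acoustic analysis.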

3.4 Acoustic and Statistical Analysis

The collected speech tokens were measured for VOT values in all cases by visual inspection

and auditory perception, and with the help of oscillograms and (wideband) spectrograms. The

beginning of VOT was defined by the burst of release. The start of waveform periodicity was used

to determine the ending of VOT. In addition, relevant acoustic cues, such as the onset of F1 and F2

of the following vowels, were also used as additional references to demarcate the VOT duration.16

With [phi55] articulated by a middle-aged female as an illustrative example, the VOT measurement in

PRAAT is shown schematically in Figure 1. The portion in red was regarded as the VOT length of

[ph], with the release burst as the onset of VOT (marked by energy release in the waveform) and the

waveform periodicity as the offset of VOT (marked by the first pulse).
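
Once the two landmarks have been located, the measurement reduces to a time difference. The following is a minimal sketch, assuming the burst and voicing-onset times have already been marked by hand and are available in seconds; the function name and the example times are hypothetical and do not come from the study's data.

```python
def vot_ms(burst_time_s, voicing_onset_s):
    """Return VOT in milliseconds.

    Positive values mean voicing starts after the release burst (voicing
    lag); negative values mean voicing starts before it (voice lead).
    """
    return (voicing_onset_s - burst_time_s) * 1000.0


if __name__ == "__main__":
    # Hypothetical landmark times for one aspirated token.
    print(round(vot_ms(burst_time_s=0.250, voicing_onset_s=0.305), 1))
```

Keeping the sign convention explicit makes the same function usable for voiced stops in other languages, where voicing may begin before the release.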

Moreover, to assess intra-rater and inter-rater reliability, one-tenth of the speech tokens were

randomly chosen and re-measured by the author and a research assistant with years of experience

in conducting acoustic research. Pearson’s correlation coefficients showed that both re-measurements

were positively correlated with the original measurements (r = .98 and r = .96 respectively, p < .001). SPSS PASW 18.0 was utilized

for statistical analysis, with a significance level set as p < .05 for all the tests.
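
For readers unfamiliar with the reliability check, the sketch below computes Pearson's r between two passes of VOT measurements in plain Python. The study itself used SPSS; the two lists of values here are hypothetical and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # HYPOTHETICAL VOT values in ms: original pass vs. re-measurement.
    first_pass = [11.2, 13.5, 24.1, 52.8, 55.0, 71.9]
    second_pass = [11.0, 13.9, 23.6, 53.5, 54.2, 72.4]
    print(round(pearson_r(first_pass, second_pass), 3))
```

A coefficient close to 1 indicates that the re-measured values track the original ones closely, which is how the reported r = .98 and r = .96 should be read.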

4. Results

This section presents the statistical results of the influences upon VOT caused by the four

15 Another method of data collection and recording is to place these speech stimuli in a carrier sentence. However, the method adopted in this current study is also broadly adopted in related VOT studies (e.g., Shi 2008).

16 Different positioning references were adopted to demarcate the ending of VOT, such as the onset of formant 1 (Chao and Chen 2008), of formant 2 (Cho and Keating 2001), or of waveform periodicity (Port and Rotunno 1979; Whalen, Levitt, and Goldstein 2007). Comparing the VOT measurements on the basis of these positioning references, Francis, Ciocca, and Yu (2003) have claimed that it is the most reliable to measure VOT with reference to acoustic waveforms.



factors under discussion (i.e., place of articulation (PoA), vowel context, gender and age). Before

the in-depth accounts, overall statistical analyses were conducted. A four-way mixed factorial

ANOVA showed no significant VOT differences in nonaspiration (F (8, 594) = .831, p = .576 > .05)

or in aspiration (F (8, 594) = .591, p = .786 > .05). There were still no significant differences in terms

of three-way mixed factorial ANOVAs (p > .05 for all cases). Given this statistical background, the

following discussion will proceed in the sequence: 4.1 PoA; 4.2 following vowel context; 4.3 age; 4.4

gender. Special attention will be directed to the two-way interactive influences of PoA in relation to

other factors.
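
The degrees of freedom reported for the four-way analysis, F (8, 594), can be reproduced with simple arithmetic under one reading of the design. The sketch below assumes that the three analyzed repetitions of each stimulus were averaged into one value per speaker and stimulus, and that the error degrees of freedom equal the number of observations minus the number of design cells. Neither assumption is stated in the paper, so this is a consistency check rather than a description of the author's procedure.

```python
# Design sizes stated in the paper.
N_SPEAKERS = 36
N_POA, N_VOWEL, N_TONE = 3, 3, 2
N_AGE, N_GENDER = 3, 2

# ASSUMPTION: one averaged VOT value per speaker and stimulus, analyzed
# separately for the unaspirated and the aspirated series.
obs_per_series = N_SPEAKERS * N_POA * N_VOWEL * N_TONE

# Degrees of freedom of the PoA x vowel x age x gender interaction.
df_interaction = (N_POA - 1) * (N_VOWEL - 1) * (N_AGE - 1) * (N_GENDER - 1)

# ASSUMPTION: error df = observations minus number of design cells.
n_cells = N_POA * N_VOWEL * N_AGE * N_GENDER
df_error = obs_per_series - n_cells

if __name__ == "__main__":
    print(obs_per_series, df_interaction, df_error)
```

Under these assumptions the sketch yields 648 observations per series, 8 interaction degrees of freedom and 594 error degrees of freedom, matching the reported F (8, 594).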

4.1 Effect of PoA

The mean VOT values for every Hakka stop, averaged over following vowel context, speakers’ age

and gender, are tabulated in Table 3, together with standard deviations (SD), minimums (Min) and

maximums (Max). As previously stated, VOT tends to correlate with stops’ PoA. The more posterior

a stop’s PoA is, the longer the intrinsic VOT will be. This tendency is confirmed in this study. The

Table 3 shows that velar stops had longer VOT than alveolar stops; alveolar stops, in turn,

had longer VOT than bilabial stops. When the place of oral constriction moves from the anterior

Figure 1　An Illustrative Example of the VOT Measurement of [ph] in [phi55] Produced by a Middle-Aged Female



to the posterior within the oral cavity, the variability of VOT also increases gradually. This is

supported by the increasing SD values from bilabials to velars.17 Furthermore, in accord with

Cho and Ladefoged’s (1999) classification, Hakka stops belong to the unaspirated and slightly

aspirated categories.

Separate one-way analyses of variance (ANOVAs) showed that the PoA effect on stops’ VOT was significant in both non-aspiration (F (2, 645) = 361.834, p = .000 < .05) and aspiration (F (2, 645) = 106.101, p = .000 < .05). Post hoc least significant difference (LSD) multiple comparisons indicated significant differences in [k-t], [k-p], [kh-th], [kh-ph] (p = .000 < .05 for all cases) and [th-ph] (p = .041 < .05). No significant VOT difference was observed in [t-p] (p = .362 > .05), in spite of the difference in their absolute VOT values. The lack of a VOT difference in [t-p] may result from the proximity of their articulatory positions (i.e., bilabial vs. alveolar). Given this situation, there must be other features that distinguish [p] from [t] (e.g., silent period, formant transition, spectral peak frequency, etc.).
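The one-way ANOVA results above can be roughly cross-checked from the summary statistics in Table 3 alone. The following is a minimal sketch, not the author's procedure: it assumes a balanced design with 216 averaged values per stop (inferred from the reported degrees of freedom, 645 + 3 = 648 values over three stops), and the function name `f_from_summary` is hypothetical.

```python
from math import fsum

def f_from_summary(means, sds, n):
    """One-way ANOVA F statistic for k equally sized groups, from means and SDs."""
    k = len(means)
    grand_mean = fsum(means) / k
    # Between-group sum of squares: group size times squared deviations of group means.
    ss_between = n * fsum((m - grand_mean) ** 2 for m in means)
    # Within-group sum of squares: (n - 1) times each group's variance.
    ss_within = (n - 1) * fsum(sd ** 2 for sd in sds)
    df_between = k - 1
    df_within = k * (n - 1)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Means and SDs (ms) from Table 3. N_PER_STOP = 216 is an assumption, see above.
N_PER_STOP = 216
f_unasp, df1_u, df2_u = f_from_summary([11.1, 12.1, 21.7], [3.0, 3.5, 6.4], N_PER_STOP)
f_asp, df1_a, df2_a = f_from_summary([52.8, 55.0, 72.4], [13.7, 13.7, 18.3], N_PER_STOP)

print(f"non-aspiration: F({df1_u}, {df2_u}) = {f_unasp:.1f}")
print(f"aspiration:     F({df1_a}, {df2_a}) = {f_asp:.1f}")
```

Under these assumptions the sketch gives F values of about 357 and 105, close to the reported 361.834 and 106.101; the small gaps are what one would expect from working with means and SDs rounded to one decimal place.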

17 The VOT values of Hakka stops in Peng (2009) were 12.64 ms for [p], 15.06 ms for [t], 27.2 ms for [k], 74.22 ms for [ph], 74.28 ms for [th], and 89.94 ms for [kh]. By comparison, the VOT values obtained in this study were shorter. This difference stems from the number of tones taken into account. Peng collected speech samples of six Hakka tones, four unchecked and two checked. Notably, VOT in unchecked-tone syllables is longer than in checked-tone ones; as a result, a larger share of unchecked-tone samples leads to longer mean VOT. If the syllables of Qu [55] and Yangru [55] in this study are considered separately, the VOT values are as follows. For Qu, the VOT values were 11.6 ms for [p], 13.1 ms for [t], 23.8 ms for [k], 58.6 ms for [ph], 59.5 ms for [th], and 79 ms for [kh]. For Yangru, the VOT values were 10.6 ms for [p], 11 ms for [t], 19.7 ms for [k], 46.9 ms for [ph], 50.5 ms for [th], and 65.9 ms for [kh]. The length difference between Qu and Yangru reaches statistical significance for all stops (p = .000 < .05 for stops other than [p]; p = .013 < .05 for [p]).

Table 3　Means, SDs, and Mins/Maxs of VOT for Individual Hakka Stops

                 PoA    Mean (ms)   SD (ms)   Min/Max (ms)
Non-aspiration   [p]    11.1         3.0       5/23
                 [t]    12.1         3.5       5/26
                 [k]    21.7         6.4       8/40
Aspiration       [ph]   52.8        13.7      21/92
                 [th]   55.0        13.7      24/96
                 [kh]   72.4        18.3      39/130

Note: All values are in milliseconds.
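The post hoc LSD comparisons reported for PoA can also be approximated from Table 3. The sketch below rests on several assumptions that the paper does not state: the error variance is pooled over all six stops, each stop contributes 216 values, and the t distribution is replaced by a normal approximation (reasonable for error degrees of freedom above 1,000). The names `TABLE3`, `N_PER_STOP`, and `lsd_p_value` are hypothetical.

```python
from math import erfc, sqrt

# Table 3 summary statistics in ms: stop -> (mean, SD).
TABLE3 = {
    "p": (11.1, 3.0), "t": (12.1, 3.5), "k": (21.7, 6.4),
    "ph": (52.8, 13.7), "th": (55.0, 13.7), "kh": (72.4, 18.3),
}
N_PER_STOP = 216  # assumption: 1,296 averaged values divided over 6 stops

def lsd_p_value(a, b, table=TABLE3, n=N_PER_STOP):
    """Two-sided LSD p-value for stops a and b (normal approximation)."""
    # With equal group sizes, the pooled error variance is the mean of the group variances.
    mse = sum(sd ** 2 for _, sd in table.values()) / len(table)
    standard_error = sqrt(mse * 2 / n)
    t_value = abs(table[a][0] - table[b][0]) / standard_error
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(t)).
    return erfc(t_value / sqrt(2))

p_t_p = lsd_p_value("t", "p")
p_th_ph = lsd_p_value("th", "ph")
p_k_t = lsd_p_value("k", "t")

for label, p in [("[t-p]", p_t_p), ("[th-ph]", p_th_ph), ("[k-t]", p_k_t)]:
    print(label, round(p, 3))
```

With these assumptions the sketch yields roughly .36 for [t-p] and .04 for [th-ph], in line with the reported .362 and .041, while [k-t] is far below .001. Whether the original analysis pooled the error term in exactly this way is not stated in the text, so the agreement should be read as suggestive only.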


4.2 Effect of Following Vowel Context

The data in Table 4, averaged over stops at different PoAs, speakers’ age, and gender, are the means and standard deviations of VOT for all Hakka stops followed by [i, a, u]. As far as the VOT averages are concerned, VOT is longest before [i], shortest before [a], and intermediate before [u] (i.e., [i] > [u] > [a]). Likewise, the longer the mean VOT is, the larger the VOT variability.

Table 4　Means, SDs, and Mins/Maxs of Stops’ VOT before [i, u, a]

Vowel   Mean (ms)   SD (ms)   Min/Max (ms)
[i]     43.9        29.6      5/130
[u]     37.1        24.8      5/98
[a]     32.3        22.9      5/107


A one-way ANOVA was used to test the effect of following vowel context on VOT production. This factor showed a highly significant effect on VOT (F (2, 1293) = 18.776, p = .000 < .05). Post hoc LSD multiple comparisons indicated significant differences in all vowel pairs (i.e., [i-u], p = .001 < .05; [i-a], p = .000 < .05; [u-a], p = .007 < .05). VOT thus varies with the following vowel context, following the hierarchy [i] > [u] > [a].

Additionally, the interaction between following vowel context and stops’ PoA was explored. The VOT means and standard deviations are arranged in Table 5, and Figure 2 gives the corresponding bar chart. A two-factor repeated measures ANOVA was used to examine the interaction of the two factors, revealing highly significant VOT differences for following vowel context (F (2, 1278) = 124.642, p = .000 < .05), stops’ PoA (F (5, 1278) = 1437.164, p = .000 < .05), and their interaction (F (10, 1278) = 11.945, p = .000 < .05). According to the statistical results, stops with more posterior PoAs in combination with higher vowels have longer VOT.


Table 5　Means and SDs of VOT for Individual Hakka Stops Followed by [i, u, a]

Stop   Vowel   Mean (ms)   SD (ms)
[p]    [i]     12.3         2.5
       [u]     12.0         3.0
       [a]      8.9         2.0
[t]    [i]     14.3         3.0
       [u]     11.8         3.3
       [a]     10.2         2.9
[k]    [i]     26.1         6.1
       [u]     21.8         5.1
       [a]     17.2         4.6
[ph]   [i]     57.7        13.9
       [u]     53.3        12.7
       [a]     47.3        12.5
[th]   [i]     62.1        14.0
       [u]     53.6        12.2
       [a]     49.3        11.6
[kh]   [i]     86.1        17.2
       [u]     70.2        13.6
       [a]     61.0        14.5

Figure 2　Bar Chart of the Means and SDs of VOT for Individual Hakka Stops Followed by [i, u, a]
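The interaction between PoA and vowel context can be read off the cell means in Table 5. As a small worked illustration (a sketch with hypothetical names such as `TABLE5_MEANS`, assuming every cell carries equal weight), the code below averages the six stops for each vowel and computes the [i] minus [a] difference for each stop.

```python
# Cell means of VOT (ms) copied from Table 5: stop -> {vowel: mean}.
TABLE5_MEANS = {
    "p":  {"i": 12.3, "u": 12.0, "a": 8.9},
    "t":  {"i": 14.3, "u": 11.8, "a": 10.2},
    "k":  {"i": 26.1, "u": 21.8, "a": 17.2},
    "ph": {"i": 57.7, "u": 53.3, "a": 47.3},
    "th": {"i": 62.1, "u": 53.6, "a": 49.3},
    "kh": {"i": 86.1, "u": 70.2, "a": 61.0},
}

def vowel_marginals(cells):
    """Mean VOT per vowel, averaging the six stop cells with equal weight."""
    vowels = ("i", "u", "a")
    return {v: round(sum(row[v] for row in cells.values()) / len(cells), 1)
            for v in vowels}

def vowel_effect_by_stop(cells):
    """Difference in mean VOT between the [i] and [a] contexts for each stop."""
    return {stop: round(row["i"] - row["a"], 1) for stop, row in cells.items()}

marginals = vowel_marginals(TABLE5_MEANS)
effects = vowel_effect_by_stop(TABLE5_MEANS)
print(marginals)
print(effects)
```

The [i] minus [a] difference grows from 3.4 ms for [p] to 25.1 ms for [kh], which is the pattern behind the significant interaction term. The recomputed means for [u] and [a] (37.1 and 32.3 ms) match Table 4; the six [i] cells average to 43.1 ms, against the 43.9 ms printed in Table 4.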

The following discussion is divided into two parts, vowels and stops. For vowels, the VOT hierarchy of velars > alveolars > bilabials mostly remained consistent in both non-aspiration and aspiration (i.e., [ki] > [ti] > [pi], [khi] > [thi] > [phi], etc.). The only exception was the [pu-tu] pair, in which [p] had slightly longer VOT than [t] in Table 5. Post hoc LSD multiple comparisons also revealed that, excluding [pu-tu] (p = .303 > .05), there were significant VOT differences in all remaining stop pairs for each vowel (p < .05 for all cases). For stops, the hierarchy [i] > [u] > [a] was maintained with reference to the VOT means (i.e., [ti] > [tu] > [ta], [phi] > [phu] > [pha], etc.). Post hoc LSD multiple comparisons showed significant differences in all vowel pairs ([i-u], [i-a], [u-a]) for [t, k, ph, th, kh] (p < .05 for all cases). For [p], there were significant differences in the pairs [i-a] and [u-a] (p = .000 < .05), but no significant VOT difference was observed in the pair [i-u] (p = .471 > .05). This issue is explicated in detail in section 5.

4.3 Effect of Age

The means and standard deviations of stops’ VOT produced by speakers of different age groups, averaged over PoA, following vowel context, and gender, are listed in Table 6. VOT tends to correlate with speakers’ age: the younger the speakers are, the longer the VOT. A one-way ANOVA showed that the age factor had a significant influence on VOT (F (2, 1293) = 4.128, p = .016 < .05). A follow-up post hoc LSD comparison showed significant VOT differences between the old and the middle-aged (p = .007 < .05) and between the old and the young (p = .031 < .05). Between the young and the middle-aged, there was no significant difference in VOT (p = .580 > .05).

Table 6　Means, SDs, and Mins/Maxs of Stops’ VOT Produced by Different Age Groups

Age               Mean (ms)   SD (ms)   Min/Max (ms)
Young (Y)         39.5        26.6      6/120
Middle-aged (M)   38.5        27.5      6/130
Old (O)           34.6        24.2      5/113


To further understand the age effect on stops’ VOT, its interaction with PoA was investigated. The means and standard deviations of VOT are arranged in Table 7, and the corresponding bar chart is given in Figure 3. In terms of the VOT means in Table 7, the age hierarchy Y > M > O held for [p], [t], [k], and [ph]; for [th] and [kh], the means of the young and the middle-aged were nearly identical, while the old again had the shortest VOT.

A two-factor repeated measures ANOVA was used to test the interaction between age and stops’ PoA. The result showed significant VOT differences for age (F (2, 1278) = 22.938, p = .000 < .05), stops’ PoA (F (5, 1278) = 1176.545, p = .000 < .05), and their interaction (F (10, 1278) = 2.424, p = .007 < .05). Moreover, for individual stops, the p values resulting from follow-up post hoc LSD multiple comparisons are given in Table 8. In non-aspiration, significant VOT differences appeared only in [k], between the old and the young and between the middle-aged and the young; for [p] and [t], no significant differences were found between any age groups. In other words, the young produced [k] with longer VOT than the other two age groups. In aspiration, significant VOT differences appeared between the old and the middle-aged and between the old and the young (p < .05 for all cases) in all stops. Between the young and the middle-aged, there was no significant VOT difference.

Table 7　Means and SDs of VOT for Individual Hakka Stops Produced by Speakers of Different Age Groups

Stop   Age   Mean (ms)   SD (ms)
[p]    O     10.8         3.4
       M     10.9         2.5
       Y     11.6         2.9
[t]    O     11.6         3.5
       M     12.1         3.4
       Y     12.6         3.5
[k]    O     19.7         5.7
       M     21.3         6.6
       Y     24.2         6.1
[ph]   O     48.4        13.4
       M     53.9        15.0
       Y     56.0        11.4
[th]   O     49.9        12.1
       M     57.7        13.8
       Y     57.3        13.7
[kh]   O     67.3        17.8
       M     75.0        19.6
       Y     75.0        16.8

Figure 3　Bar Chart of the Means and SDs of VOT for Individual Hakka Stops Produced by Speakers of Different Age Groups
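The size of the age effect for each stop can be made explicit from the means in Table 7. The following sketch (hypothetical names, tabulated means only) computes the difference between the young and the old for each stop and checks for which stops the strict ordering Y > M > O holds.

```python
# Mean VOT (ms) by stop and age group, copied from Table 7.
TABLE7_MEANS = {
    "p":  {"O": 10.8, "M": 10.9, "Y": 11.6},
    "t":  {"O": 11.6, "M": 12.1, "Y": 12.6},
    "k":  {"O": 19.7, "M": 21.3, "Y": 24.2},
    "ph": {"O": 48.4, "M": 53.9, "Y": 56.0},
    "th": {"O": 49.9, "M": 57.7, "Y": 57.3},
    "kh": {"O": 67.3, "M": 75.0, "Y": 75.0},
}

def young_minus_old(cells):
    """Mean VOT of the young minus that of the old, per stop (ms)."""
    return {stop: round(row["Y"] - row["O"], 1) for stop, row in cells.items()}

def strict_age_hierarchy(cells):
    """True where the tabulated means satisfy Y > M > O strictly."""
    return {stop: row["Y"] > row["M"] > row["O"] for stop, row in cells.items()}

age_gap = young_minus_old(TABLE7_MEANS)
hierarchy = strict_age_hierarchy(TABLE7_MEANS)
print(age_gap)
print(hierarchy)
```

The gap between the young and the old is about 1 ms or less for [p] and [t] but between 4.5 and 7.7 ms for [k, ph, th, kh], which matches the stops where Table 8 reports significant age differences. The strict ordering holds for [p, t, k, ph]; for [th] and [kh] the young and middle-aged means are practically equal.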


Table 8　Results of LSD Multiple Comparisons among Different Age Groups for Stops

Stop   Age groups compared   p
[p]    O vs. M               .800
       O vs. Y               .093
       M vs. Y               .153
[t]    O vs. M               .364
       O vs. Y               .086
       M vs. Y               .416
[k]    O vs. M               .134
       O vs. Y               .000
       M vs. Y               .004
[ph]   O vs. M               .011
       O vs. Y               .001
       M vs. Y               .344
[th]   O vs. M               .001
       O vs. Y               .001
       M vs. Y               .870
[kh]   O vs. M               .011
       O vs. Y               .011
       M vs. Y               .996

4.4 Effect of Gender

For the gender factor, the means and standard deviations of stops’ VOT produced by male and female speakers are displayed in Table 9. The VOT averages and SD values were quite similar in both genders. A one-way ANOVA was conducted to examine the effect of gender on stops’ VOT, but no significant VOT difference was observed (F (1, 1293) = 0.021, p = .864 > .05). That is, the gender factor played no statistically significant role in the production of VOT in Hakka stops. Despite the lack of a statistically significant difference, this result reflects some sociolinguistic observations about gender, discussed later.

Table 9　Means, SDs, and Mins/Maxs of VOT Produced by Speakers of Different Genders

Gender        Mean (ms)   SD (ms)   Min/Max (ms)
Males (M)     37.3        25.9      6/120
Females (F)   37.7        26.5      5/130


Irrespective of the non-significant difference in stops’ VOT as a whole, how the gender factor interacts with stops of different PoAs remains to be examined. The VOT values of individual stops produced by both genders are arranged in Table 10, and Figure 4 gives the corresponding bar chart. In terms of the VOT means, males produced longer VOT than females in non-aspirated stops and shorter VOT than females in aspirated stops, but all the differences were slight. A two-factor repeated measures ANOVA confirmed the non-significant VOT differences for gender (F (1, 1284) = .284, p = .594 > .05) and for the interaction between gender and PoA (F (5, 1284) = 0.587, p = .710 > .05). By contrast, the factor of PoA presented a significant main effect on VOT (F (5, 1284) = 0.587, p = .000 < .05) in both genders’ production of stops with different PoAs (i.e., velars > alveolars > bilabials).

Table 10　Means and SDs of VOT for Individual Hakka Stops Produced by Speakers of Different Genders

PoA    Gender   Mean   SD      PoA    Gender   Mean   SD
[p]    Female   11.0   3.3     [ph]   Female   53.4   13.5
       Male     11.2   2.6            Male     52.1   13.9
[t]    Female   11.9   3.8     [th]   Female   55.1   13.0
       Male     12.3   3.2            Male     54.9   14.3
[k]    Female   21.2   6.8     [kh]   Female   73.5   17.5
       Male     22.2   3.1            Male     71.4   19.2

Note: All values are in milliseconds.

Figure 4　Bar Chart of the Means and SDs of VOT for Individual Hakka Stops Produced by Speakers of Different Genders
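The gender pattern in Table 10 (males slightly longer in non-aspirated stops, females slightly longer in aspirated stops) amounts to a wider aspiration contrast for females. A short worked illustration follows; the dictionary layout and the function name `aspiration_contrast` are hypothetical, and only the tabulated means are used.

```python
# Mean VOT (ms) from Table 10: PoA -> gender -> (non-aspirated, aspirated).
TABLE10_MEANS = {
    "bilabial": {"F": (11.0, 53.4), "M": (11.2, 52.1)},
    "alveolar": {"F": (11.9, 55.1), "M": (12.3, 54.9)},
    "velar":    {"F": (21.2, 73.5), "M": (22.2, 71.4)},
}

def aspiration_contrast(cells):
    """Aspirated minus non-aspirated mean VOT, per PoA and gender (ms)."""
    return {
        poa: {gender: round(asp - unasp, 1)
              for gender, (unasp, asp) in by_gender.items()}
        for poa, by_gender in cells.items()
    }

contrast = aspiration_contrast(TABLE10_MEANS)
for poa, by_gender in contrast.items():
    gap = round(by_gender["F"] - by_gender["M"], 1)
    print(f"{poa}: F = {by_gender['F']} ms, M = {by_gender['M']} ms, F - M = {gap} ms")
```

At every PoA the female contrast exceeds the male contrast, by between 0.6 and 3.1 ms. These are differences of means only; as reported above, neither gender nor its interaction with PoA reached statistical significance.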


5. General Discussion

This study has taken the factors of place of articulation, following vowel context, and speakers’ age and gender into consideration, finding that all factors except gender played a role in the VOT variation of Hakka stops. Below, this study offers accounts of the factors involved and of issues relevant to the findings.

With regard to the PoA factor, it is well known that VOT varies as a function of the PoA of stops on account of physiological/aerodynamic constraints in speech production (Cho and Ladefoged 1999). This general pattern is verified in this study as well, according to the VOT means (i.e., [k] (21.7 ms) > [t] (12.1 ms) > [p] (11.1 ms) and [kh] (72.4 ms) > [th] (55.0 ms) > [ph] (52.8 ms)). However, the VOT difference in [t-p] did not reach statistical significance, which may result partly from the shortness of VOT and partly from the closeness of the articulatory positions of both stops. The same cause may also account for the lack of a significant VOT difference in the [pu-tu] pair.

As for the factor of following vowel context, VOT varies according to the following vowel. This study confirms that stops before [i, u] have longer VOT than those before [a]. High vowels are produced with a more obstructed cavity and a less abrupt air pressure drop in the oral cavity than low vowels. When stops are pronounced in anticipation of high vowels, the airstream escaping from the oral cavity becomes relatively slow, so the release of the airstream (i.e., VOT) is lengthened. Moreover, the effects of [i] and [u] on VOT are also shown to be different, with [i] stronger than [u] for [t, k, ph, th, kh]. For [p], as mentioned in section 4.2, there is no significant VOT difference between [pi] and [pu] (p = .471 > .05). Vowel frontness thus seems to play no role in the VOT production of Hakka syllable-initial stops. In fact, for stops other than [p], the VOT difference between [i] and [u] can be argued to be crucially related to vowel height rather than vowel frontness. As indicated in Ladefoged (2001), vowels specified as [+high] may not have an equal tongue height in real articulation. This viewpoint is well captured in Cheng’s (2012) acoustic exploration of the vowel pattern of Sixian Hakka. As shown in Figure 5, the front vowels [i, e] are produced with higher tongue positions than the corresponding back vowels [u, o], regardless of gender.18

18 In Figure 5, there is a gender difference in vowel space area, with females’ larger than males’. Moreover, compared with females’ [u], males’ [u] tends to be unstable, as shown by its much wider distribution in the vowel space.

Figure 5　Vowel Space Areas Produced by Males and Females of Sixian Hakka: (a) Hakka monophthongs [i, e, a, o, u, ö(ii)] produced by 18 males; (b) Hakka monophthongs [i, e, a, o, u, ö(ii)] produced by 18 females

The tongue height difference between [i] and [u] is also supported across languages. In his investigation of the first formant (F1) of [i] and [u] in 30 languages, de Boer (2010) found that the mean F1 is higher for [u] than for [i] in 29 out of 30 languages. The F1 values of vowels are known to be inversely related to tongue height, with open and close vowels having high and low F1 respectively (Catford 2001; Fry 2001; Stevens 1998). Accordingly, [i] is made with a higher tongue position than [u] and results in longer VOT of stops. Given the difference in vowel height between [i] and [u], [p]’s VOT in [pi] should, in principle, be longer than that in [pu]. Why no significant VOT difference existed between [pi] and [pu] calls for further explanation. In fact, the effect of the [i/u] difference is counterbalanced by the co-articulation effect resulting from producing the rounded vowel [u]. Compared to [i], [u] is uttered with lip protrusion, which not only lengthens the vocal tract but also prolongs the lag of air release.19 This explains the non-significant difference in VOT between [pi] and [pu]. Nonetheless, one question arises in this line of reasoning. If the co-articulation effect resulting from the production of [u] makes the VOT values in [pi] and [pu] statistically undifferentiated, why does there still exist a significant VOT difference between [phi] and [phu]? The answer lies in the VOT length. When VOT exceeds a certain length (for example, that of [t] in this study), the co-articulation effect on VOT is outweighed by the PoA effect. This explains why stops other than [p] explicitly realize the vowel height difference between [i] and [u]. Briefly, vowel height, rather than vowel frontness, plays a decisive role in the VOT production of Hakka stops. Based on the preceding discussion, subsequent VOT studies should take the difference in vowel height between [i] and [u] into account, and take notice of the co-articulation effect between [p] and [u].

19 For more discussion of the co-articulation of lip rounding, please refer to Daniloff and Moll (1968).

Regarding speakers’ age and gender, different results were found in this study. As for gender, there are physiological/anatomical differences between males and females (e.g., Cho and Ladefoged 1999; Docherty 1992; Titze 1994).20 However, the gender effect on VOT was not statistically significant.21 This suggests that speakers’ gender should not be regarded as a factor associated with VOT variability. Regardless of the non-significant VOT difference between genders, a pattern reflecting gender difference did emerge as far as the VOT averages were concerned. Compared with females, males produced longer VOT in short-lag stops ([p]: 11.2 ms (Male, M)/11.0 ms (Female, F), [t]: 12.3 ms (M)/11.9 ms (F), [k]: 22.2 ms (M)/21.2 ms (F)), but shorter VOT in long-lag stops ([ph]: 52.1 ms (M)/53.4 ms (F), [th]: 54.9 ms (M)/55.1 ms (F), [kh]: 71.4 ms (M)/73.5 ms (F)). The phonetic VOT contrast between non-aspirated and aspirated stops was thus comparatively larger in females than in males. It has been broadly documented in sociolinguistic research (e.g., Coates 1993; Eckert 1996; Labov 2001; Lakoff 1975; Wodak and Benke 2001) that females use clearer pronunciation and speak in a more disciplined manner than males. Thus, the difference in VOT between short-lag and long-lag stops produced by the two genders, though slight, is in line with such sociolinguistic observations. Besides, it is reported that the perception of more precise articulation of stops by females is associated with longer VOT or a wider VOT difference between voiced and voiceless stops or between non-aspiration and aspiration (Wadnerkar, Cowell, and Whiteside 2006; Whiteside, Hanson, and Cowell 2004; Whiteside and Irving 1997; Whiteside and Marshall 2001).22 The result of this study supports the sociolinguistic and perceptual view. In fact, the gender difference is also illustrated in the vowel space patterns in Figure 5, with females’ larger than males’. Given this background of females’ VOT production, how the foreign spouses in Hakka villages produce stops in Hakka (and Mandarin Chinese) should be further scrutinized. Because of their diverse language backgrounds, they may expand or reduce the VOT differences between aspirated and non-aspirated stops, and consequently encounter difficulty in learning stops.

With regard to the age factor, there were significant VOT differences in short-lag and long-lag stops among different age groups. By and large, the old produced shorter and more variable VOT than the young and the middle-aged, particularly for long-lag stops (Morris and Brown 1987; Neiman et al. 1983; Sweeting and Baken 1982). Specifically, the longer the VOT is, the more readily the age effect is revealed. This accounts for why statistically significant VOT differences surface only in [k, ph, th, kh], but not in [p, t]. The shorter VOT of the aged may lie in the smaller lung volumes of old speakers compared with those of other groups of speakers (Hoit, Solomon, and Hixon 1993). Additionally, there is also reason to believe that the speech production time of elderly adults is related to many physical and psychological changes associated with normal aging. For example, speech production performance is influenced by cognitive and memory changes (Ulatowska 1985), and by physical changes resulting from aging (Kahane 1981).23 Degeneration in the central nervous systems of elderly adults may give rise to longer muscle contraction time, slower monosynaptic reflex time, and the like (Kenney 1982; Valenstein 1981; Welford 1977; Whitbourne 1985). These aging and degenerating processes are generally believed to give rise to a general slowing of neural processing and to affect sensory and motor performance abilities related to speech production (Kent and Burkard 1981). All these factors help explain the shorter VOT values produced by the old speakers in this study.

20 Titze (1994) showed that the average vocal fold membranous length is 6 mm shorter in female adults than in male adults. The shorter membranous length would increase the possibility of a more rapid closure gesture, resulting in the higher average fundamental frequency (F0) in female speech compared to male speech. Given that VOT is affected by the abduction speed of the vocal folds (Kewley-Port and Preston 1974), a gender bias between male and female plosives would thus be created.

21 Some readers may wonder why the gender issue is discussed if the gender factor did not reach a significant difference in VOT. The answer is that not all linguistic changes need to be statistically supported. For example, according to many sociolinguistic and sociophonetic studies, there are “in-progress changes” affected by social factors such as age, gender, educational background, and socio-economic status. The changes resulting from these social factors may not reach statistical significance, yet they show a tendency for linguistic items to change gradually with some social factors, say, gender or age. Hence, the lack of a statistically significant difference in VOT should not preclude discussion of this issue, especially when the data show a general tendency in terms of the gender factor (see Li (2013) for some discussion in this aspect).

22 Additionally, similar to speech rate, speech formality is also a factor influencing the difference between males’ and females’ VOT lengths. Generally speaking, speakers pronounce words clearly when speech situations are formal. In the present study, speech data were elicited by having speakers read and repeat a list of isolated syllables, a rather formal means of data collection. For isolated syllables, the aerodynamic and articulatory differences between males and females are not great enough to create a significant difference. This offers a reason for the slight VOT difference between males and females. If less formal methods (e.g., reading word pairs, reading texts, telling stories, casual conversation) are used for gathering data, larger gender differences in VOT can be expected. For related discussion of the gender factor in VOT, please refer to Morris et al. (2008).

23 Based on Kahane (1981), the physical changes of elderly adults include: (1) reduced biomechanical efficiency of the respiratory, laryngeal, and articulatory systems; (2) disturbances of motor and sensory innervation; and (3) mucosal and biochemical changes.


6. Conclusion

In conclusion, this study examined the effects of PoA, following vowel context, and speakers’ age and gender on the VOT production of Hakka syllable-initial stops. Results indicated that all factors except gender influenced the VOT production of Hakka stops. This suggests that controlling for subjects’ gender in VOT production, at least for isolated syllables, is unnecessary in future related research. Yet, in spite of the lack of a significant VOT difference between genders, the result still follows the general sociolinguistic observation that females are more careful in their pronunciation than males. Finally, surveys like the present one, specific to one language/dialect, are widely observed across languages, as stated in the introduction, so the present study provides a general comprehension of the VOT issue in Hakka stops, represents an endeavor to bridge the gap in the VOT studies of Hakka stops, and puts Hakka into the VOT array of world languages. Moreover, it establishes a basis for mutual linguistic comparisons between Hakka and foreign languages, especially for the increasing number of foreign spouses in Hakka villages.24

As already mentioned in section 2, VOT has been shown to be influenced by a complex set of factors. Besides the factors explored in this study, future investigation should examine other factors, including word/non-word alternation, syllable position, utterance type (Baran et al. 1977), race (Ryalls and Zipper 1997), language experience (Flege and Eefting 1987), study design differences (Ryalls and Zipper 1997), speech tempo and rate (Kessinger and Blumstein 1997, 1998; Theodore, Miller, and DeSteno 2007), dialectal background (Schmidt and Flege 1996; Syrdal 1996), and the like. In this way, a fuller understanding of the VOT issue in Hakka stops can be achieved.

24 One reviewer suggested that more emphasis should be put on how the foreign spouses learn Hakka. However, such discussion was not presented here for the following reasons. First, this study aimed only to provide a VOT reference norm for future studies. As is known, most foreign spouses learn Mandarin when moving to Taiwan, even in Hakka villages. Some of the foreign spouses actually learn Hakka, but the aim of the project and time limitations precluded the author from the time-consuming collection of speech data from the foreign spouses. Second, unlike in Mandarin, acoustic analysis has scarcely been applied to Hakka and Southern Min, neither of which is a mainstream dialect in Taiwan. For this reason, the reference norm must be established first; only then will follow-up studies of related issues be possible. Notably, acoustic analysis is frequently used in research on foreign students who learn Mandarin as a foreign or second language, as in Cai and Gao (2002), Y. Lin and Wang (2005), Wen et al. (2009), Chung and Shi (2009), Zhuang and Guan (2009), and so on. By comparison, Hakka is rather under-investigated in this respect and thus calls for further endeavors.



Acknowledgements

This study was financially sponsored by the Hakka Affairs Council (Project No.: 100-03-01-02). Thanks go to my two research assistants, Yu-Ching Huang and Li-Ping Fu, for their help with speech data collection, VOT measurement, and statistical analysis. I also thank the two anonymous reviewers for their insightful comments and constructive suggestions, most of which were incorporated into this article. Of course, all remaining errors are mine.


References

Abdelli-Beruh, Nassima B. “Influence of Place of Articulation on Some Acoustic Correlates of the

Stop Voicing Contrast in Parisian French,” Journal of Phonetics, 37 (2009): 66-78.

Antoniou, Mark, Catherine T. Best, Michael D. Tyler, and Christian Kroos. “Inter-language

Interference in VOT Production by L2-dominant Bilinguals: Asymmetries in Phonetic Code-

switching,” Journal of Phonetics, 39 (2011): 558-570.

Auzou, Pascal, Canan Ozsancak, Richard J. Morris, Mary Jan, Francis Eustache, and Didier

Hannequin. “Voice Onset Time in Aphasia, Apraxia of Speech and Dysarthria: A Review,” Clinical

Linguistics and Phonetics, 14.2 (2000): 131-150.

Baker, Julie, Jack Ryalls, Alejandro Brice, and Janet Whiteside. “Voice Onset Time Production in

Speakers with Alzheimer’s Disease,” Clinical Linguistics and Phonetics, 21 (2007): 859-867.

Baran, Jane A., Marsha Z. Laufer, and Ray Daniloff. “Phonological Contrastivity in Conversation: A

Comparative Study of Voice Onset Time,” Journal of Phonetics, 5 (1977): 339-350.

Beckman, Jill, Petur Helgason, Bob McMurray, and Catherine Ringen. “Rate Effects on Swedish

VOT: Evidence for Phonological Overspecification,” Journal of Phonetics, 39 (2011): 39-49.

Boersma, Paul and David Weenink. PRAAT: Doing Phonetics by Computer (Version 5217)

[Computer software] (Amsterdam, the Netherlands: Institute of Phonetic Sciences, University

of Amsterdam, 2009).

Cai, Zheng-Ying and Wen Gao. “An Analysis of the Pronunciation Errors of Thai Students,” Chinese

Teaching in the World, 60 (2002): 86-92.

Catford, John C. A Practical Introduction to Phonetics (Oxford, UK: Oxford University Press, 2001).

Chao, Kuan-Yi, Ghada Khattab, and Li-Mei Chen. “Comparison of VOT Patterns in Mandarin

Chinese and in English,” in the 4th Annual Hawaii International Conference on Arts and

Humanities, Hawaii, January 11-14, 2006, by Hawaii University.

Chao, Kuan-Yi and Li-Mei Chen. “A Cross-linguistic Study of Voice Onset Time in Stop Consonant

Productions,” Computational Linguistics and Chinese Language Processing, 13.2 (2008): 215-

232.

Chao, Yuen-Ren. “A System of Tone Letters,” Le Maître Phonétique, 45 (1930): 24-27.

Chen, Li-Mei, Kuan-Yi Chao, and Jui-Feng Peng. “VOT Productions of Word-initial Stops in

Mandarin and English: A Cross-language Study,” in the 19th Conference on Computational

Linguistics and Speech Processing, Taipei, September 6-7, 2007, by the Association for



Computational Linguistics and Chinese Language Processing.

Cheng, Ming-Chung. “An Acoustic Analysis of the Vowel Pattern in Taiwan Sixian Hakka,” Journal

of Hakka Study, 5.2 (2012): 1-36.

Cho, Taehong and Patricia A. Keating. “Articulatory and Acoustic Studies on Domain-initial

Strengthening in Korean,” Journal of Phonetics, 29 (2001): 155-190.

Cho, Taehong and Peter Ladefoged. “Variation and Universals in VOT: Evidence from 18

Languages,” Journal of Phonetics, 27 (1999): 207-229.

Cho, Taehong, Sun-Ah Jun, and Peter Ladefoged. “Acoustic and Aerodynamic Correlates of Korean

Stops and Fricatives,” Journal of Phonetics, 30 (2002): 193-228.

Chung, Raung-Fu. Contrastive Analysis and Mandarin Teaching (Taipei, Taiwan: Cheng Chung

Bookstore, 2010).

Chung, Raung-Fu and Qiu-Xue Shi. “An Acoustic Contrastive Analysis on an American Learner’s

Mandarin Fricatives,” Journal of Chinese Language Teaching, 6.2 (2009): 129-162.

Coates, Jennifer. Women, Men and Language (New York, NY: Longman, 1993).

Daniloff, Raymond and Kenneth Moll. “Co-articulation of Lip Rounding,” Journal of Speech and

Hearing Research, 11 (1968): 693-706.

de Boer, Bart. “First Formant Difference for /i/ and /u/: A Cross-linguistic Study and an Explanation,”

Journal of Phonetics, 39 (2010): 110-114.

Docherty, Gerard J. The Timing of Voicing in British English (New York, NY: Fortis, 1992).

Eckert, Penelope. “The Whole Woman: Sex and Gender Differences in Variation,” in The Matrix

of Language: Contemporary Linguistic Anthropology, eds. Donald Brenneis and Ronald K. S.

Macaulay (Boulder, CO: Westview Press, 1996), 116-137.

Englund, Kjellrun. “Voice Onset Time in Infant Directed Speech Over the First Six Months,” First

Language, 25.2 (2005): 219-234.

Esposito, Anna. “On Vowel Height and Consonantal Voicing Effects: Data from Italian,” Phonetica,

59 (2002): 197-231.

Fant, Gunnar. Speech Sounds and Features (Cambridge, MA: MIT Press, 1973).

Fischer, Emily and Alexander M. Goberman. “Voice Onset Time in Parkinson Disease,” Journal of

Communication Disorders, 43 (2010): 21-34.

Fischer-Jørgensen, Eli. “Temporal Relations in Danish Tautosyllabic CV Sequences with Stop

Consonants,” Annual Report of the Institute of Phonetics of the University of Copenhagen, 14

(1980): 207-261.

Flege, James E. and Wieke Eefting. “Production and Perception of English Stops by Native Spanish

Speakers,” Journal of Phonetics, 15 (1987): 67-83.

Francis, Alexander L., Valter Ciocca, and Man-Ching Yu. “Accuracy and Variability of Acoustic

Measures of Voicing Onset,” Journal of the Acoustical Society of America, 113.2 (2003): 1025-

1032.

Fry, Dennis B. The Physics of Speech (Cambridge, UK: Cambridge University Press, 2001).

Gamkrelidze, Thomas V. “On the Correlation of Stops and Fricatives in a Phonological System,”

in Universals of Human Language, Vol. 2: Phonology, ed. Joseph H. Greenberg (Stanford, CA:

Stanford University Press, 1978), 9-46.

Gao, Mei-Shu. “HanHan seyin/secayin de duibi shiyan yanjiu” (An Experimental Study of

Contrasting Stops and Affricates between Mandarin Chinese and Korean), Hanyu Xuexi (Chinese

Language Learning), 4 (2001): 51-54.

Gósy, Maria. “The VOT of the Hungarian Voiceless Plosives in Words and in Spontaneous Speech,”

International Journal of Speech Technology, 4 (2001): 75-78.

Han, Mieko S. and Raymond S. Weitzman. “Acoustic Features of Korean /P, T, K/, /p, t, k/ and /ph,

th, kh/,” Phonetica, 22 (1970): 112-128.

Hardcastle, William J. “Some Observations on the Tense-lax Distinction in Initial Stops in Korean,”

Journal of Phonetics, 1 (1973): 263-271.

Hazan, Valerie L. and Georges Boulakia. “Perception and Production of a Voicing Contrast by

French-English Bilinguals,” Language and Speech, 36 (1993): 17-39.

Helgason, Petur and Catherine Ringen. “Voicing and Aspiration in Swedish Stops,” Journal of

Phonetics, 36 (2008): 607-628.

Henton, Caroline, Peter Ladefoged, and Ian Maddieson. “Stops in the World’s Languages,”

Phonetica, 49 (1992): 65-101.

Higgins, Maureen B., Ronald Netsell, and Laura Schulte. “Vowel-related Difference in Laryngeal

Articulatory and Phonatory Function,” Journal of Speech, Language, and Hearing Research, 41

(1998): 712-724.

Hoit, Jeannette D., Nancy P. Solomon, and Thomas J. Hixon. “Effect of Lung Volume on Voice

Onset Time,” Journal of Speech, Language, and Hearing Research, 36 (1993): 516-520.

Jäncke, Lutz. “Variability and Duration of Voice Onset Time and Phonation in Stuttering and

Nonstuttering Adults,” Journal of Fluency Disorders, 19.1 (1994): 21-37.

Jessen, Michael and Catherine Ringen. “Laryngeal Features in German,” Phonology, 19 (2002): 189-

218.

Kahane, Joel C. “Anatomic and Physiological Changes in the Aging Peripheral Speech Mechanism,”

in Aging: Communication Processes and Disorders, eds. Daniel S. Beasley and G. Albyn Davis

(New York, NY: Grune and Stratton, 1981), 21-45.

Kang, Kyoung-Ho and Susan G. Guion. “Phonological Systems in Bilinguals: Age of Learning

Effects on the Stop Consonant Systems of Korean-English Bilinguals,” Journal of the Acoustical

Society of America, 119 (2006): 1672-1683.

Keating, Patricia, Wendy Linker, and Marie Huffman. “Patterns in Allophone Distribution for Voiced

and Voiceless Stops,” Journal of Phonetics, 11 (1983): 277-290.

Kehoe, Margaret, Conxita Lleó, and Martin Rakow. “Voice Onset Time in Bilingual German-Spanish

Children,” Bilingualism: Language and Cognition, 7.1 (2004): 71-88.

Kenney, Richard A. Physiology of Aging: A Synopsis (Chicago, IL: Year Book Medical Publishers,

1982).

Kent, Ray D. and Robert Burkard. “Changes in Acoustic Correlates of Speech Production,” in Aging:

Communication Processes and Disorders, eds. Daniel S. Beasley and G. Albyn Davis (New

York, NY: Grune and Stratton, 1981), 47-62.

Kessinger, Rachel and Sheila E. Blumstein. “Effects of Speaking Rate on Voice-onset Time in Thai,

French, and English,” Journal of Phonetics, 25 (1997): 143-168.

Kessinger, Rachel and Sheila E. Blumstein. “Effects of Speaking Rate on Voice-onset Time and

Vowel Production: Some Implications for Perception Studies,” Journal of Phonetics, 26 (1998):

117-128.

Kewley-Port, Diane and Malcolm S. Preston. “Early Apical Stop Production: A Voice Onset Time

Analysis,” Journal of Phonetics, 2 (1974): 195-210.

Khattab, Ghada. “VOT Production in English and Arabic Bilingual and Monolingual Children,”

Leeds Working Papers in Linguistics and Phonetics, 8 (2000): 95-122.

Khouw, Edward and Valter Ciocca. “An Acoustic and Perceptual Study of Initial Stops Produced by

Profoundly Hearing Impaired Adolescents,” Clinical Linguistics and Phonetics, 21.1 (2007):

13-27.

Klatt, Dennis. “Voice Onset Time, Frication, and Aspiration in Word-initial Consonant Clusters,”

Journal of Speech and Hearing Research, 18 (1975): 686-706.

Koenig, Laura L. “Laryngeal Factors in Voiceless Consonant Production in Men, Women, and

5-year-olds,” Journal of Speech, Language, and Hearing Research, 43 (2000): 1211-1228.

Kong, Eun Jong, Mary E. Beckman, and Jan Edwards. “Why are Korean Tense Stops Acquired So

Early?: The Role of Acoustic Properties,” Journal of Phonetics, 39 (2011): 196-211.

Labov, William. Principles of Linguistic Change: Social Factors (Oxford, UK: Blackwell, 2001).

Ladefoged, Peter. A Course in Phonetics (4th ed.) (Boston, MA: Heinle and Heinle, 2001).

Lakoff, Robin. Language and Women’s Place (New York, NY: Harper and Row, 1975).

Li, Fang-Fang. “The Effect of Speakers’ Sex on Voice Onset Time in Mandarin Stops,” Journal of

the Acoustical Society of America, 133.2 (2013): 142-147.

Liang, Chiou-Wen. “The Acoustic Characteristics of Hakka Consonants and Vowels” (Master thesis,

National Kaohsiung Normal University, 2005).

Liang, Lei. “Shegen seyin shengxue tezheng chutan” (A Preliminary Acoustic Exploration of Velar

Stops), Journal of Baoding Teachers College, 14.1 (2001): 67-69.

Liao, Shu-Jong. “Interlanguage Production of English Stop Consonants: A VOT Analysis” (Master

thesis, National Kaohsiung Normal University, 2005).

Lin, Tao and Li-Jia Wang. Yuyinxue jiaocheng (A Course in Phonetics) (Taipei, Taiwan: Wunan,

2005).

Lin, Yi-Gao and Gong-Ping Wang. “Yinni liuxuesheng xide Hanyu seyin han secayin shiyan yanjiu”

(An Experimental Study on the Acquisition of Stops and Affricates in Mandarin Chinese by

Indonesian Learners), Yuyan Jiaoxue yu Yanjiu (Language Teaching and Linguistic Studies), 4

(2005): 59-65.

Lisker, Leigh and Arthur S. Abramson. “A Cross-language Study of Voicing in Initial Stops:

Acoustical Measurements,” Word, 20 (1964): 384-422.

Lisker, Leigh and Arthur S. Abramson. “Some Effects of Context on Voice Onset Time in English

Stops,” Language and Speech, 10 (1967): 1-28.

Liss, Julie M., Gary Weismer, and John C. Rosenbek. “Selected Acoustic Characteristics of Speech

Production in Very Old Males,” Journal of Gerontology, 45 (1990): 35-45.

Liu, Hanjun, Manwa L. Ng, Minxi Wan, Supin Wang, and Yi Zhang. “Effects of Place of Articulation

and Aspiration on Voice Onset Time in Mandarin Esophageal Speech,” Folia Phoniatrica et

Logopaedica, 59 (2007): 147-154.

Liu, Hanjun, Manwa L. Ng, Minxi Wan, Supin Wang, and Yi Zhang. “The Effect of Tonal Changes

on Voice Onset Time in Mandarin Esophageal Speech,” Journal of Voice, 22.2 (2008): 210-218.

Lo, Seogim. Chongxiu Miaolixian zhi: Yuyan zhi (Reedited Recording of Miaoli County: Language)

(Miaoli, Taiwan: Miaoli County Government, 2007).

Maddieson, Ian. Patterns of Sounds (Cambridge, UK: Cambridge University Press, 1984).

Maddieson, Ian. “Phonetic Universals,” in The Handbook of Phonetic Sciences, eds. John Laver and

William J. Hardcastle (Oxford, UK: Blackwell, 1997), 619-639.

Maddieson, Ian and Peter Ladefoged. Sounds of the World’s Languages (Oxford, UK: Blackwell,

1996).

Magloire, Joel and Kerry P. Green. “A Cross-language Comparison of Speaking Rate Effects on the

Production of Voice Onset Time in English and Spanish,” Phonetica, 56 (1999): 158-185.

McCrea, Christopher R. and Richard J. Morris. “The Effects of Fundamental Frequency Level on

Voice Onset Time in Normal Adult Male Speakers,” Journal of Speech, Language and Hearing

Research, 48 (2005): 1013-1024.

Morris, Richard J. and W. Samuel Brown. “Age-related Voice Measures among Adult Women,”

Journal of Voice, 1 (1987): 38-43.

Morris, Richard J., Christopher R. McCrea, and Kaileen D. Herring. “Voice Onset Time Differences

between Adult Males and Females: Isolated Syllables,” Journal of Phonetics, 36 (2008): 308-

317.

Neiman, Gary S., Richard J. Klich, and Elaine M. Shuey. “Voice Onset Time in Young and 70-year-

old Women,” Journal of Speech and Hearing Research, 26 (1983): 118-123.

Nielsen, Kuniko. “Specificity and Abstractness of VOT Imitation,” Journal of Phonetics, 39 (2011):

132-142.

Ng, Manwa L. and Juliana Wong. “Voice Onset Time Characteristics of Esophageal, Tracheo-

esophageal, and Laryngeal Speech of Cantonese,” Journal of Speech, Language, and Hearing

Research, 52 (2009): 780-789.

Ogasawara, Naomi. “Acoustic Analysis of Voice-onset Time in Taiwan Mandarin and Japanese,”

Concentric: Studies in Linguistics, 37.2 (2011): 155-178.

Öğüt, Fatih, Mehmet A. Kiliç, Erkan Zeki Engin, and Rasit Midilli. “Voice Onset Times for Turkish

Stop Consonants,” Speech Communication, 48 (2006): 1094-1099.

Oh, Eunjin. “Effects of Speaking Gender on Voice Onset Time in Korean,” Journal of the Acoustical

Society of America, 125.4 (2011): 2574.

Peng, Jui-Feng. “Factors for Voice Onset Time: Stops in Mandarin and Hakka” (Master thesis,

National Cheng Kung University, 2009).

Peterson, Gordon E. and Ilse Lehiste. “Duration of Syllable Nuclei in English,” Journal of the

Acoustical Society of America, 32 (1960): 693-703.

Petrosino, Linda, Roger D. Colcord, Karen B. Kurcz, and Robert J. Yonker. “Voice Onset Time of

Velar Stop Productions in Aged Speakers,” Perceptual and Motor Skills, 76.1 (1993): 83-88.

Pind, Jorgen. “Speaking Rate, Voice-onset Time and Quantity: The Search for Higher-order

Invariants for Two Icelandic Speech Cues,” Perception and Psychophysics, 57 (1995): 291-304.

Pind, Jorgen. “Rate-dependent Perception of Aspiration and Pre-aspiration in Icelandic,” Quarterly

Journal of Experimental Psychology, 49A.3 (1996): 745-764.

Port, Robert F. and Rosemarie Rotunno. “Relation Between Voice-onset Time and Vowel Duration,”

Journal of the Acoustical Society of America, 66.3 (1979): 654-662.

Ran, Qi-Bin. Fuyin xianxiang yu fuyin texing: Jiyu putonghua de Hanyu zuse fuyin shiyan yanjiu

(Phonetic Phenomena and Characteristics of Consonants: An Experimental Study Based on the

Obstruents in Standard Chinese) (Tianjin, China: Nankai University Press, 2008).

Ren, Hong-Mo. “Beijinghua seyin de yanjiu” (Study of Stops in Beijing Mandarin) (Master thesis,

Chinese Academy of Social Sciences, 1981).

Riney, T. James and Naoyuki Takagi. “Global Foreign Accent and Voice Onset Time among Japanese

EFL Speakers,” Language Learning, 49.2 (1999): 275-302.

Riney, Timothy James, Naoyuki Takagi, Kaori Ota, and Yoko Uchida. “The Intermediate Degree of

VOT in Japanese Initial Voiceless Stops,” Journal of Phonetics, 35 (2007): 439-443.

Ringen, Catherine and Kari Suomi. “The Voicing Contrast in Fenno-Swedish Stops,” Journal of

Phonetics, 40 (2012): 419-429.

Robb, Michael, Harvey Gilbert, and Jay Lerman. “Influence of Gender and Environmental Setting on

VOT,” Folia Phoniatrica et Logopaedica, 57 (2005): 125-133.

Rochet, Bernard L. and Yanmei Fei. “Effect of Consonant and Vowel Context on Mandarin Chinese

VOT: Production and Perception,” Canadian Acoustics, 19.4 (1991): 105.

Rosner, Burton S., Luis E. López-Bascuas, Jose E. García-Albea, and Richard P. Fahey. “Voice-onset

Times for Castilian Spanish Initial Stops,” Journal of Phonetics, 28 (2000): 217-224.

Ryalls, Jack and Allison Zipprer. “A Preliminary Investigation of the Effects of Gender and Race on

Voice Onset Time,” Journal of Speech and Hearing Research, 40.3 (1997): 642-645.

Ryalls, Jack, Mami Simon, and Jerry Thomason. “Voice Onset Time Production in Older Caucasian-

and African-Americans,” Journal of Multilingual Communication Disorders, 2.1 (2004): 61-67.

Schmidt, Anna Maria and James Emil Flege. “Speaking Rate Effects on Stops Produced by Spanish

and English Monolinguals and Spanish / English Bilinguals,” Phonetica, 53.3 (1996): 162-179.

Shi, Feng. Yuyin geju (The Pattern of Sounds: An Integration of Phonetics and Phonology)

(Beijing, China: Commercial Press, 2008).

Shi, Feng and Rong-Rong Liao. “Zhongmei xuesheng hanyu seyin shizhi duibi fenxi” (Contrastive

Analysis of the Length of Stops in Mandarin between Chinese and American Students), Language

Teaching and Linguistic Studies, 4 (1986): 67-83.

Shimizu, Katsumasa. A Cross-language Study of Voicing Contrasts of Stop Consonants in Asian

Languages (Tokyo, Japan: Seibido Publishing, 1996).

Simon, Ellen. “Child L2 Development: A Longitudinal Case Study on Voice Onset Times in Word-

initial Stops,” Journal of Child Language, 37 (2010): 159-173.

Smith, Bruce L. “Effects of Place of Articulation and Vowel Environment on Voiced Stop Consonant

Production,” Glossa, 12 (1978): 129-134.

Smith, Bruce L., Jan Wasowicz, and Judy Preston. “Temporal Characteristics of the Speech of Normal

Elderly Adults,” Journal of Speech and Hearing Research, 30 (1987): 522-529.

Stevens, Kenneth N. Acoustic Phonetics (Cambridge, MA: MIT Press, 1998).

Sundberg, Ulla and Francisco Lacerda. “Voice Onset Time in Speech to Infants and Adults,”

Phonetica, 56 (1999): 186-199.

Swartz, Bradford L. “Gender Difference in Voice Onset Time,” Perceptual and Motor Skills, 75

(1992): 983-992.

Sweeting, Patricia M. and Ronald J. Baken. “Voice Onset Time in a Normal-aged Population,”

Journal of Speech and Hearing Research, 25 (1982): 129-134.

Syrdal, Ann K. “Acoustic Variability in Spontaneous Conversational Speech of American English

Talkers,” in Proceedings of 4th International Conference of Spoken Language Processing, eds.

H. Timothy Bunnell and William Idsardi (Newark, DE: University of Delaware Alfred I duPont

Institute, 1996), 438-441.

Theodore, Rachel M., Joanne L. Miller, and David DeSteno. “The Effect of Speaking Rate on Voice

Onset Time Is Talker Specific,” Saarbrucken, 8 (2007): 6-10.

Thornburgh, Dianne F. and John H. Ryalls. “Voice Onset Time in Spanish-English Bilinguals: Early

Versus Late Learners of English,” Journal of Communication Disorders, 31 (1998): 215-229.

Titze, Ingo R. Principles of Voice Production (Englewood Cliffs, NJ: Prentice-Hall, 1994).

Torre, Peter and Jessica A. Barlow. “Age-related Changes in Acoustic Characteristics of Adult

Speech,” Journal of Communication Disorders, 42 (2009): 324-333.

Tsao, Feng-Ming. “Fuyinde shengxue texing” (Acoustic Characteristics of Consonants), in Yuyan

binglixue jichu Dier juan (Foundations of Language Pathology, Vol. 2), ed. Chin-Hsing Tseng

(Taipei, Taiwan: Psychological Publishing Co., 1996), 32-66.

Tseng, Chiu-Yu. “An Investigation on Voice Onset Time and Tone Production of Chinese Aphasia,”

Bulletin of the Institute of History and Philology, Academia Sinica, 65.1 (1994): 37-79.

Tsui, Ida Y.-H. and Valter Ciocca. “Perception of Aspiration and Place of Articulation of Cantonese

Initial Stops by Normal and Sensorineural Hearing-impaired Listeners,” International Journal

of Language and Communication Disorders, 35.4 (2000): 507-525.

Ulatowska, Hanna K. The Aging Brain: Communication in the Elderly (San Diego, CA: College-Hill

Press, 1985).

Valenstein, Edward. “Age-related Changes in the Human Central Nervous System,” in Aging:

Communication Processes and Disorders, eds. Daniel S. Beasley and G. Albyn Davis (New

York, NY: Grune and Stratton, 1981), 47-62.

Volaitis, Lydia E. and Joanne L. Miller. “Phonetic Prototypes: Influence of Place of Articulation

and Speaking Rate on the Internal Structure of Voicing Categories,” Journal of the Acoustical

Society of America, 92 (1992): 723-735.

Wadnerkar, Meghana B., Patricia E. Cowell, and Sandra P. Whiteside. “Speech Across the Menstrual

Cycle: A Replication and Extension,” Neuroscience Letters, 408 (2006): 21-24.

Wang, Mei-Jung, Hsueh-Chu Chen, and Ling-Ling Tsai. “An Acoustic Analysis of Chinese and

English Consonants Stops Produced by Native Speakers and EFL Learners,” Journal of Foreign

Language Instruction, 1.1 (2006): 125-138.

Wang, Yun-Jia and Xue-Na Shangguan. “Riben xuexizhe dui hanyu Putonghua busongqi / songqi

fuyin de jiagong” (Production of Unaspirated / Aspirated Consonants of Japanese Learners of

Standard Chinese), Chinese Teaching in the World, 70 (2004): 54-66.

Weismer, Gary. “Sensitivity of Voice-onset Time Measures to Certain Segmental Features in Speech

Production,” Journal of Phonetics, 7 (1979): 194-204.

Welford, Alan T. “Motor Performance,” in Handbook of the Psychology of Aging, eds. James E.

Birren and K. Warner Schaie (New York, NY: Van Nostrand Reinhold, 1977), 450-496.

Wen, Bao-Yin, Qi-Bin Ran, and Feng Shi. “Deguo xuesheng xide Hanyu seyin shengmu de chubu

fenxi” (A Preliminary Study of the Chinese Stop Initials Produced by German Learners),

Journal of Yunnan Normal University, 7.4 (2009): 54-61.

Whalen, D. H., Andrea G. Levitt, and Louis M. Goldstein. “VOT in the Babbling of French- and

English-learning Infants,” Journal of Phonetics, 35 (2007): 341-352.

Whitbourne, Susan Krauss. The Aging Body: Physiological Changes and Psychological

Consequences (New York, NY: Springer-Verlag, 1985).

Whiteside, Sandra P., Anna Hanson, and Patricia Cowell. “Hormones and Temporal Components of

Speech: Sex Differences and Effects of Menstrual Cyclicity on Speech,” Neuroscience Letters,

367 (2004): 44-47.

Whiteside, Sandra P. and Caroline J. Irving. “Speaker’s Sex Differences in Voice Onset Time: Some

Preliminary Findings,” Perceptual and Motor Skills, 85 (1997): 459-463.

Whiteside, Sandra P. and Caroline J. Irving. “Speaker’s Sex Differences in Voice Onset Time: A

Study of Isolated Word Production,” Perceptual and Motor Skills, 86 (1998): 651-654.

Whiteside, Sandra P. and John Marshall. “Developmental Trends in Voice Onset Time: Some

Evidence for Sex Differences,” Phonetica, 58 (2001): 196-210.

Whiteside, Sandra P., Luisa Henry, and Rachel Dobbin. “Sex Difference in Voice Onset Time: A

Developmental Study of Phonetic Context Effects in British English,” Journal of the Acoustical

Society of America, 116.2 (2004): 1179-1183.

Williams, Lee. “The Voicing Contrast in Spanish,” Journal of Phonetics, 5 (1977): 169-184.

Wodak, Ruth and Gertraud Benke. “Gender as a Sociolinguistic Variable: New Perspectives on

Variation Studies,” in The Handbook of Sociolinguistics, ed. Florian Coulmas (Oxford, UK:

Wiley-Blackwell, 2001), 127-149.

Wu, Hsiao-Lin. “Stops and Affricates in Mandarin Chinese and Hakka: A VOT Analysis” (Master

thesis, National Kaohsiung Normal University, 2009).

Wu, Zong-Ji. “Putonghua fuyin busongqi/songqi qubie de shiyan yanjiu” (An Experimental Study of

the Non-aspiration/Aspiration Distinction in Standard Chinese), in Wuzongji Yuyanxue Lunwenji

(Collection of Zong-Ji Wu’s Linguistic Research Papers), ed. Zong-Ji Wu (Beijing, China:

Commercial Press, 2004), 31-65.

Wu, Zong-Ji and Mao-Can Lin. “Shiyan Yuyinxue Gaiyao” (An Outline of Experimental Phonetics)

(Beijing, China: Higher Education Press, 1989).

Yao, Yao. “Closure Duration and VOT of Word-initial Voiceless Plosives in English in Spontaneous

Connected Speech,” UC Berkeley Phonology Lab Annual Report (2007): 183-225.

Yavaş, Mehmet. “Voice Onset Time Patterns in Bilingual Phonological Development,” in

Investigations in Clinical Phonetics and Linguistics, eds. Fay Windsor, M. Louise Kelly and

Nigel Hewlett (Mahwah, NJ: Lawrence Erlbaum Associates, 2002), 341-350.

Yavaş, Mehmet and Renee Wildermuth. “The Effects of Place of Articulation and Vowel Height in

the Acquisition of English Aspirated Stops by Spanish Speakers,” IRAL, 44 (2006): 251-263.

Zhuang, Jie and Ying-Wei Guan. “An Experimental Study and Error Analysis of the Acquisition of

Stops and Affricates in Chinese by Vietnamese Learners,” Journal of Yunmeng, 30.2 (2009):

144-147.

Zlatin, Marsha A. “Voicing Contrast and Voice Onset Time,” Journal of the Acoustical Society of

America, 56.3 (1974): 981-987.

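Voice Onset Time of Syllable-Initial Stops in Sixian Hakka: Isolated Syllables

Ming-Chung Cheng
Institute of Hakka Language and Communication, National United University
Associate Professor

This study investigates how the voice onset time (VOT) of syllable-initial stops in isolated syllables of Sixian Hakka varies with place of articulation, following vowel, and the speaker's gender and age, in order to establish reference norms for Hakka VOT under these factors. Thirty-six local Hakka speakers from Miaoli took part and were divided into six groups according to gender and age. They read aloud a set of 36 words formed by combining the Hakka stops [p, t, k, ph, th, kh] with the vowels [i, a, u] and two high tones; the recordings were then segmented and the stop VOT values measured in PRAAT. The results show that every factor except gender had a significant effect on the VOT of Hakka stops. Although gender did not reach statistical significance, the absolute VOT values differed clearly between the two genders, which agrees with observations on gender in sociolinguistic research. The study also discusses in detail the interaction between the stops' place of articulation and the following vowel, showing that vowel height affects VOT more than vowel frontness does. In sum, the study's contribution lies in using a relatively large sample to clarify Hakka VOT and its variation, and in establishing a reference basis for future VOT comparisons between Hakka and other Chinese dialects or other languages, particularly comparisons between Hakka and the languages of foreign spouses of various nationalities living in Hakka villages.

Keywords: age, gender, Hakka, stop, voice onset time

Corresponding author: Ming-Chung Cheng, E-mail: [email protected]
Manuscript received: 2013/02/25; Modified: 2013/04/11; Accepted: 2013/04/12. doi: 10.6210/JNTNULL.2013.58(2).08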
