HKBU Papers in Applied Language Studies Vol. 14, 2010

Investigating L1 Transfer in L2 Speech Perception: Evidence from Vietnamese Speakers’ Perception of English Vowel Contrasts*

Qin Chuan Hong Kong Baptist University

Abstract

To investigate the effect of L1 transfer in L2 speech perception, an identification test of four RP vowel contrasts (/i/-/ɪ/, /u/-/ʊ/, /ɔ/-/ɒ/, /e/-/æ/) is administered to 15 native Vietnamese speakers. Two of the contrasts (/i/-/ɪ/, /u/-/ʊ/) are not phonologically contrastive in Vietnamese, and the other two (/ɔ/-/ɒ/, /e/-/æ/) have phonological counterparts in Vietnamese. A supplementary production test of the same vowel contrasts is also given to the subjects. Based on an analysis of the results and a phonological and phonetic comparison of L1 and L2, the following conclusions are reached: (1) L1 transfer is an important factor in L2 speech perception, but its effect varies among contrasts. (2) L1 transfer can occur at both the phonological level and the phonetic level, and different levels of transfer can lead to different perceptual performance.

1. Introduction

A core issue in the study of L2 speech perception is L1 transfer. Most of the well-known models of L2 speech perception, e.g. the Perceptual Assimilation Model (Best, 1995), the Speech Learning Model (Flege, 1995), and the Native Language Magnet Model (Iverson & Kuhl, 1995), are based on the similarity and dissimilarity of the native language and the target language. According to these models, the relative difference between L1 and L2 can

Qin: Investigating L1 Transfer in L2 Speech Perception

determine the performance in L2 speech perception. Indeed, problems in the perception of non-native sounds have been extensively observed in previous L2 perception studies (Beddor & Strange, 1982; Best & Strange, 1992; Flege & Eefting, 1988; Gottfried & Beddor, 1988; Rochet, 1995; Sung, 2005), and most of these studies tend to attribute the problems to L1 transfer. However, cases in which L1 transfer cannot account for the perceptual performance are also occasionally reported (Bohn, 1995; Bohn & Flege, 1990). For example, Bohn (1995) proposes that non-native listeners may prefer a certain type of acoustic cue in L2 speech perception regardless of what their L1 is. These seemingly divergent views, together with the discrepancies in experimental results, make L1 transfer a complicated issue in the study of L2 speech perception.

More importantly, most previous studies focus only on the perception of non-native contrasts (i.e. contrasts without a phonological counterpart in L1), and very few test contrasts that do have phonological counterparts in L1. If L1 is the decisive factor in L2 speech perception, contrasts with phonological counterparts in L1 should pose little difficulty to non-native listeners. If those contrasts also prove problematic for non-native listeners, it would not be sensible to attribute the difficulties with non-native contrasts to L1 transfer. This gap gives rise to the research question of this study, stated below.

(1) Research question

How does L1 affect the perception of L2 contrasts, especially the contrasts having phonological counterparts in L1?

To answer the research question, an identification test of four Received Pronunciation (RP) 1 vowel contrasts (/i/-/ɪ/,

/u/-/ʊ/, /ɔ/-/ɒ/, /e/-/æ/) is administered to 15 native Vietnamese speakers. Of the four contrasts, /i/-/ɪ/ and /u/-/ʊ/ are not phonologically contrastive in Vietnamese, but the other two, i.e. /ɔ/-/ɒ/ and /e/-/æ/, have phonological counterparts in Vietnamese. The stimuli are synthetic pure vowels (details in Section 3.1). Because both spectral cues and temporal cues are exploited by English and Vietnamese to distinguish vowels, the lengths of the vowels (except /e/ and /æ/, since the RP /e/-/æ/ contrast involves no length contrast) are manipulated to obtain a length contrast. In the test, the subjects are asked to identify the stimuli they hear as the vowels in real English words. A follow-up production test containing the same words as the identification test is also conducted to provide further evidence for the later discussion.

2. Comparison of Vietnamese (L1) and English (L2)

Since L1 transfer figures prominently in this study, a brief comparison of the L1 and L2 vowel systems is needed. As shown in (2) and (3), the vowel systems of both RP and Vietnamese have three degrees of contrastive height and three degrees of frontness. All RP front vowels and central vowels are unrounded. Except for /ɑ/, all RP back vowels are rounded. Similar to RP, all Vietnamese front vowels and central vowels are unrounded, and all back vowels are rounded.

(2) RP monophthongs2

Front Central Back Rounded Unrounded

Tense Lax Tense Lax

Tense Lax Tense Lax High i u Mid e Low æ

(adapted from Roca & Johnson, 1999, p. 190)

(3) Monophthongs in Hanoi Dialect3

Front Central Back High i u Mid e o Low a a

(adapted from Pham, 2008, p. 3)

On the one hand, the tense-lax contrasts in RP high front unrounded vowels (/i/-/ɪ/) and high back rounded vowels (/u/-/ʊ/) do not have phonological counterparts in Vietnamese. On the other hand, the phonological counterparts of the other two RP contrasts in this study, i.e. mid front unrounded vowel (/e/) vs. low front unrounded vowel (/æ/), and mid back rounded vowel (/ɔ/) vs. low back rounded vowel (/ɒ/), can be found in the Vietnamese vowel system (/e/-/ɛ/ and /o/-/ɔ/) (see (4) for an illustration).

(4) Four RP vowel contrasts and their phonological counterparts in Vietnamese

RP contrasts                  /i/-/ɪ/    /u/-/ʊ/    /ɔ/-/ɒ/    /e/-/æ/
Counterparts in Vietnamese    absent     absent     /o/-/ɔ/    /e/-/ɛ/

Temporal cues are utilized by RP to differentiate vowels. Except for /e/-/æ/, all the RP contrasts in this study (/i/-/ɪ/, /u/-/ʊ/, /ɔ/-/ɒ/) involve a length distinction. Vietnamese also makes use of temporal cues, since the members of two Vietnamese vowel pairs (/ɤ/-/ɤ̆/ and /a/-/ă/) contrast in length, e.g. /bɤt/ “to reduce”, /bɤ̆t/ “a kind of card game”, /bat/ “bowl”, /băt/ “to catch”. In view of the above comparison, if L1 is the decisive factor in L2 speech perception, it can be predicted that Vietnamese speakers would fail to spectrally distinguish RP /i/-/ɪ/ and /u/-/ʊ/, and that they would have little difficulty in spectrally distinguishing RP /e/-/æ/ and /ɔ/-/ɒ/. Because of the utilization of temporal cues in both languages, they would resort to

temporal cues to distinguish RP /i/-/ɪ/ and /u/-/ʊ/, two contrasts with no phonological counterpart in L1. A summary of this prediction is provided in (5).

(5) Predicted subjects’ perceptual performance

a. /i/-/ɪ/ → distinguished by length, not by quality
b. /u/-/ʊ/ → distinguished by length, not by quality
c. /ɔ/-/ɒ/ → distinguished by both length and quality
d. /e/-/æ/ → distinguished by quality

3. Methodology

3.1 Stimuli

Considering the effect of Vietnam’s English Teacher and Trainer Network (VTTN), an English teacher training project organized by the British Council in 20 provinces in Vietnam since 2000,4 a British influence is assumed for the Vietnamese speakers from the VTTN provinces. For this reason, the stimuli in the identification test are RP vowels and the subjects are limited to Vietnamese speakers from the VTTN provinces. A group of synthetic RP pure vowels (/i/, /ɪ/, /u/, /ʊ/, /ɔ/, /ɒ/, /e/, /æ/) generated by Vowel Synthesis Interface (Bunnell, 1999) are used as the stimuli. The synthesis targets are the average values of the first three formants reported in Henton (1983), in which the vowels produced by 10 male RP speakers were measured. The fundamental frequency of the stimuli is 131 Hz, a typical pitch value for a male speaker. The lengths of six vowels (/i/, /ɪ/, /u/, /ʊ/, /ɔ/, /ɒ/) are manipulated to obtain a length contrast. Thus, each of the six vowels has a long stimulus and a short stimulus. The lengths of

/e/ and /æ/ are not manipulated because they do not contrast in length, and both of them are treated as short vowels in this study. The lengths of the long stimuli and the short stimuli are 319 ms and 172 ms respectively, which are the average lengths of English long vowels and English short vowels before voiced consonants in Wiik (as cited in Cruttenden, 2008, p. 95). After the length manipulation, there are 14 stimuli in total, as shown in (6).

(6) The stimuli in the present study

Stimulus number    Vowel    F1 (Hz)    F2 (Hz)    F3 (Hz)
1                  [i]      272        2361       3056
2                  [i]      272        2361       3056
3                  [ɪ]      380        2085       2710
4                  [ɪ]      380        2085       2710
5                  [u]      347        1149       2300
6                  [u]      347        1149       2300
7                  [ʊ]      406        1103       2367
8                  [ʊ]      406        1103       2367
9                  [ɔ]      429        697        2441
10                 [ɔ]      429        697        2441
11                 [ɒ]      551        860        2530
12                 [ɒ]      551        860        2530
13                 [æ]      713        1615       2491
14                 [e]      525        1943       2622
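The stimulus inventory described in Section 3.1 can be sketched in a few lines of Python. This is only an illustration of how the 14 stimuli arise from eight vowel qualities and two durations; the IPA keys are a reading of table (6), and the names `FORMANTS_HZ`, `build_stimuli` and `STIMULI` are hypothetical, not part of the study's materials.

```python
# Formant targets in Hz as (F1, F2, F3), copied from table (6).
# Assumption: the IPA keys below are a reading of the table's vowel column.
FORMANTS_HZ = {
    "i": (272, 2361, 3056),
    "ɪ": (380, 2085, 2710),
    "u": (347, 1149, 2300),
    "ʊ": (406, 1103, 2367),
    "ɔ": (429, 697, 2441),
    "ɒ": (551, 860, 2530),
    "æ": (713, 1615, 2491),
    "e": (525, 1943, 2622),
}

LONG_MS = 319   # duration of the long stimuli
SHORT_MS = 172  # duration of the short stimuli

# Six vowels get both a long and a short version; /e/ and /æ/ stay short.
LENGTH_PAIRED = ("i", "ɪ", "u", "ʊ", "ɔ", "ɒ")
SHORT_ONLY = ("æ", "e")


def build_stimuli():
    """Return a list of (vowel, duration_ms, formants) tuples."""
    stimuli = []
    for vowel in LENGTH_PAIRED:
        for duration in (LONG_MS, SHORT_MS):
            stimuli.append((vowel, duration, FORMANTS_HZ[vowel]))
    for vowel in SHORT_ONLY:
        stimuli.append((vowel, SHORT_MS, FORMANTS_HZ[vowel]))
    return stimuli


STIMULI = build_stimuli()
```

The order in which the sketch lists long and short versions is arbitrary and need not match the stimulus numbering in (6).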

3.2 Subjects

The subjects are chosen from third-year Vietnamese undergraduates majoring in Business Chinese at Guangxi Normal University. Four criteria are used for the choice of subjects. Firstly, they should be from VTTN provinces. Secondly, they should have no experience of living in English-speaking countries. Thirdly, since most Vietnamese students start learning English in junior high school, those who did not start learning English at this age are not chosen. Lastly, to avoid dialectal influence, the

subjects are chosen only from northern Vietnam, where the Northern (Hanoi) Dialect is spoken. Fifteen students meeting all of the above criteria are randomly chosen as the subjects. Of them, nine are female and six are male. They began learning Chinese at undergraduate level. Vietnamese and Chinese are their most frequently used languages in daily life, while their use of English is confined to certain courses and to communication with fellow foreign students who do not know Chinese.

3.3 Procedures

3.3.1 Identification Test

Before the identification test, there is a discrimination test in which the subjects are asked to judge the acoustic differences between the stimuli. The discrimination test is not relevant to the current topic, so it is not discussed in this paper. However, since the two tests use the same stimuli, it is assumed that the subjects would have noticed the differences between the stimuli after the discrimination test. In the identification test, all of the 14 stimuli are presented to the subjects. After hearing a stimulus, the subjects choose between two words on the answer sheet, picking the word whose vowel sounds more like the stimulus. For the /i/-/ɪ/ related stimuli (long and short [i], long and short [ɪ]), the alternatives on the answer sheet are beat and bit. For the /u/-/ʊ/ related stimuli (long and short [u], long and short [ʊ]), the alternatives are who’d and hood. For the /ɔ/-/ɒ/ related stimuli (long and short [ɔ], long and short [ɒ]), the alternatives are caught and cot. For the /æ/-/e/ related stimuli ([æ], [e]), the alternatives are had and head. Before the test, it has been ascertained that all words fall within the subjects’ vocabulary. Five repetitions of each stimulus are created and all the tokens are randomized. Thus,

there are 70 tokens (14 × 5) in total. To prevent confusion, the number of each token is announced before the token in the recording. The interval between neighboring tokens is five seconds, and the total length of the recording is 8 minutes and 30 seconds. The test is conducted in a language laboratory at Guangxi Normal University. The headphones are Lange LC EP16T, with a frequency range of 20 Hz to 20000 Hz and an impedance of 32 Ohm.

A challenge for the identification test is that it might be inappropriate to force the subjects to label a sound as a phoneme, since the absolute phonetic realizations of these phonemes may vary among the subjects. Instead, it is more sensible to let the subjects make judgments based on the relative differences between the stimuli. To establish a reference for these relative differences, all the short stimuli ([i], [ɪ], [u], [ʊ], [ɔ], [ɒ], [æ], [e]) are played twice in a practice trial before the test. The practice trial, along with the effect of the preceding discrimination test, should be sufficient for the subjects to make comparisons for the contrasts [i]-[ɪ], [u]-[ʊ], [ɔ]-[ɒ] and [æ]-[e].

3.3.2 Production Test

A production test is conducted one day after the identification test in a quiet classroom. The same subjects are asked to read a word list containing the same words as the identification test (i.e. beat, bit, who’d, hood, caught, cot, had, head). All the test words are repeated twice and randomized. Words belonging to the same phonemic contrast are not adjacent to each other. Six dummy words (two tokens of apple, two tokens of banana, and two tokens of baby) are inserted into the list to distract the subjects’ attention from the target contrasts. There are altogether 22 words in the word list (16 test words and six

dummy words). Unlike the conventional /hVd/ context, this study uses words more familiar to the subjects, because foreign language learners may not know words such as hod and hawed. The subjects read the word list individually. The recordings are made using Praat (5.1.31), with a sampling frequency of 22050 Hz. The same software is used for the subsequent acoustic analyses of the recordings. The microphone used in this study is a Philips SHM1500 headset, with a frequency range of 20 Hz to 11000 Hz and an impedance of 2200 Ohm. All recordings are saved as WAV files.

4. Results

4.1 Identification Test

The identification results can be divided into two types: the results of the normal-length stimuli and the results of the manipulated-length stimuli. The normal-length stimuli are the stimuli matching RP both in length and in vowel quality. There are eight normal-length stimuli ([iː], [ɪ], [uː], [ʊ], [ɔː], [ɒ], [e], [æ]) in this study. Because they are normal both in length and in quality, it is easy to judge whether the identifications of these stimuli are correct or not. On the other hand, the manipulated-length stimuli are the stimuli matching RP in quality but with the opposite length to RP. There are six such stimuli ([i], [ɪː], [u], [ʊː], [ɔ], [ɒː]) in this study. Though unusual in length, these stimuli can throw light on the relative weights of temporal cues and spectral cues in L2 speech perception. In fact, there is no absolute correctness for the identifications of these stimuli. For example, if one identifies an [ɪː] token as the vowel of bit, s/he makes a correct identification in terms of vowel quality but an incorrect one in terms of vowel length. In this

study, the criterion based on vowel quality is adopted, e.g. if one identifies an [ɪː] token as the vowel of beat, it is counted as an incorrect identification.

4.1.1 The /i/-/ɪ/ Contrast

For the contrast /i/-/ɪ/, the normal-length stimuli are [iː] and [ɪ], and the manipulated-length ones are [i] and [ɪː]. (7) presents the incorrect identifications of these stimuli. “Mean of each subject” indicates the subjects’ average number of incorrect identifications. Its maximum value is five for each stimulus and 10 for “Two as a whole”. “% incorrect” indicates the overall misidentification percentage for each stimulus and for the two as a whole.

(7) Incorrect identifications of the contrast /i/-/ɪ/

                        Normal-length stimuli             Manipulated-length stimuli
                        [iː]    [ɪ]     Two as a whole    [i]     [ɪː]    Two as a whole
Mean of each subject    1.53    1.2     2.73              2.53    4.33    6.87
% incorrect             30.6    24      27.3              50.6    86.6    68.7

Because the normal-length stimuli match the vowels of beat and bit both in quality and in length, their incorrect identification percentages are the lowest among the four stimuli. However, their overall incorrect percentage is still as high as 27.3%. This suggests that the tense-lax or long-short distinction may not be closely associated with the pronunciations of bit and beat in the subjects’ L2 phonological systems. When the lengths are manipulated, the identification performance deteriorates markedly. Half of the [i] tokens (50.6%) are perceived as bit and a vast majority of the [ɪː] tokens (86.6%) are identified as beat. A comparison of the results for the two groups of stimuli shows that the temporal cue outweighs the spectral cue in the perception of the contrast /i/-/ɪ/.
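The relation between the two rows of (7) is simple arithmetic: each subject hears five tokens of each stimulus, so a mean error count is converted to a percentage by dividing by five (or by ten for two stimuli taken together). The sketch below reproduces that conversion; the function and variable names are hypothetical, the length marks on the keys are a reading of the table, and the pooled values are recomputed from the rounded means, so they may differ from the published figures in the last digit.

```python
TRIALS_PER_STIMULUS = 5  # five repetitions of each stimulus per subject


def percent_incorrect(mean_errors, n_stimuli=1):
    """Convert a per-subject mean error count into a percentage."""
    return round(100 * mean_errors / (TRIALS_PER_STIMULUS * n_stimuli), 1)


# Mean incorrect identifications per subject, copied from table (7).
MEAN_ERRORS = {
    "normal": {"iː": 1.53, "ɪ": 1.20},
    "manipulated": {"i": 2.53, "ɪː": 4.33},
}


def summarize(mean_errors):
    """Return per-stimulus and pooled error percentages for each condition."""
    summary = {}
    for condition, stimuli in mean_errors.items():
        per_stimulus = {v: percent_incorrect(m) for v, m in stimuli.items()}
        pooled = percent_incorrect(sum(stimuli.values()), n_stimuli=len(stimuli))
        summary[condition] = {"per_stimulus": per_stimulus, "pooled": pooled}
    return summary


SUMMARY = summarize(MEAN_ERRORS)
```

Because correctness is scored by vowel quality, an "incorrect" response to a manipulated-length stimulus is one that follows the stimulus length instead, so the pooled manipulated-length figure can also be read as the share of length-based responses.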

4.1.2 The /u/-/ʊ/ Contrast

The normal-length stimuli for the contrast /u/-/ʊ/ are [uː] and [ʊ], and the manipulated-length stimuli are [u] and [ʊː]. (8) presents the identification results of these stimuli.

(8) Incorrect identifications of the contrast /u/-/ʊ/

                        Normal-length stimuli             Manipulated-length stimuli
                        [uː]    [ʊ]     Two as a whole    [u]     [ʊː]    Two as a whole
Mean of each subject    1.13    1.8     2.93              3.07    3.73    6.8
% incorrect             22.6    36      29.3              61.4    74.6    68

Although more than a quarter of the normal-length stimuli are perceived incorrectly, their incorrect identification percentages are the lowest among the four stimuli. When the lengths are manipulated, the overall incorrect percentage rises to 68%. As more than half of the manipulated-length stimuli are identified on the basis of length, an influence of vowel duration is noticeable. The overall perceptual performance for the two types of stimuli indicates that the subjects rely primarily on the temporal cue to distinguish who’d and hood.

4.1.3 The /ɔ/-/ɒ/ Contrast

[ɔː] and [ɒ] are the normal-length stimuli for /ɔ/ and /ɒ/; [ɔ] and [ɒː] are the two manipulated-length stimuli. The identification results for these stimuli are provided in (9).

(9) Incorrect identifications of the contrast /ɔ/-/ɒ/

                        Normal-length stimuli             Manipulated-length stimuli
                        [ɔː]    [ɒ]     Two as a whole    [ɔ]     [ɒː]    Two as a whole
Mean of each subject    1.07    2.13    3.2               2.93    3.67    6.6
% incorrect             21.4    42.6    32                58.6    73.4    66

The overall misidentification percentage of the normal-length stimuli (32%) is markedly lower than that of the

manipulated-length ones (66%). However, the incorrect percentages, especially that of [ɒ], indicate that the subjects do encounter difficulty in identification. For the manipulated-length stimuli, the subjects are more prone to make judgments based on vowel length. As with the contrasts /i/-/ɪ/ and /u/-/ʊ/, the perceptual performance for /ɔ/-/ɒ/ reveals that the subjects distinguish this contrast mainly by length rather than by vowel quality.

4.1.4 The /e/-/æ/ Contrast

[e] and [æ] are the only two stimuli for the contrast /e/-/æ/. Since they are equal in length, quality difference is the only available cue. (10) provides the identification results for the two stimuli.

(10) Incorrect identifications of the contrast /e/-/æ/

                        [e]     [æ]     Two stimuli as a whole
Mean of each subject    1.87    2.13    3.93
% incorrect             37.3    42.6    39.3

Overall, the identification of the two stimuli is problematic, since the overall incorrect percentage is as high as 39.3%. Without the help of a temporal cue, it seems more difficult for the subjects to make judgments merely on the basis of vowel quality.

4.1.5 Summary

Despite the problems found in the identification of all contrasts, the subjects do distinguish these contrasts. Such distinctions, however, are made on the basis of vowel length rather than vowel quality. When quality and length conflict with each other, quality gives way to length. This trend is demonstrated by the identifications of the manipulated-length

stimuli, which favor the lengths of the target words. When the lengths of the stimuli are normal, the majority of the identifications are correct.

4.2 Production Test

There are eight test words. Each word has 30 tokens (2 × 15). F1, F2, and the duration of the vowels in the test words are measured. Because airflow hits the microphone in some recordings, the sound quality of these recordings is unsatisfactory. This makes some tokens unmeasurable or gives unrealistic formant values. For instance, the F1 value of one beat token is 843 Hz, which is far higher than the normal F1 value of /i/. However, judged by ear, that token is a normal [bit]. If such unrealistic formant values were included in the analysis of the production data, a misleading picture of the subjects’ pronunciation would be drawn. For this reason, 5 tokens of beat, 5 tokens of bit, 10 tokens of hood, 9 tokens of who’d, 4 tokens of head, and 4 tokens of had are excluded from the analysis of the production data. Probably because of the low and back articulatory position in the pronunciations of cot and caught, their F1 and F2 are very close, which makes the two formants indistinguishable for Praat. This closeness between F1 and F2, together with the unsatisfactory sound quality, makes the production data of cot and caught particularly problematic, with the majority of the cot tokens and caught tokens giving unrealistic values (higher than 800 Hz for F1, or higher than 1500 Hz for F2). In light of this, the contrast /ɔ/-/ɒ/ has to be excluded from the production data analysis.

4.2.1 Production of beat and bit

(11) presents the overall production performance for beat and bit in all tokens (for further information, see Appendix A). The standard deviations between the tokens are also provided.

(11) Production of beat and bit

                      Beat                                    Bit
                      F1 (Hz)   F2 (Hz)   Duration (ms)       F1 (Hz)   F2 (Hz)   Duration (ms)
Mean of all tokens    391.68    2414.24   116.96              391.52    2439.44   105.88
Standard deviation    43.14     231.57    26.10               46.46     224.40    25.68

As shown in (11), the key values for the two words are very close. Very few subjects show a quality difference in production, and the pronunciations of the two words fluctuate across a wide range and often overlap (see Appendix A). In view of this, the production data provide no persuasive evidence for a quality difference between /i/ and /ɪ/. The average duration of the ea in beat is slightly longer than that of the i in bit, but the role of temporal properties in production is not as prominent as in identification.

4.2.2 Production of who’d and hood

The overall production performance for who’d and hood is shown in (12) (for more information, see Appendix B).

(12) Production of who’d and hood

                      Who’d                                   Hood
                      F1 (Hz)   F2 (Hz)   Duration (ms)       F1 (Hz)   F2 (Hz)   Duration (ms)
Mean of all tokens    417.61    916.05    203.14              460.8     1007.25   111.3
Standard deviation    73.15     172.69    46.08               58.99     149.04    30.59

Evidence for a quality difference between who’d and hood is found in the production data. Both the average F1 value and the average F2 value in hood are higher than those in who’d, which suggests a more front and lower tongue position for the vowel in hood than for the vowel in who’d. Of the 11 subjects with analyzable recordings, seven show a quality distinction between the two words in all tokens (see Appendix B). The remaining four subjects do not exhibit a clear boundary between the pronunciations of the two words in terms of vowel quality. As for vowel length, most of the who’d tokens are noticeably longer than the hood tokens.

4.2.3 Production of head and had

The overall production performance for head and had is provided in (13) (see Appendix C for further information).

(13) Production of head and had

                      Head                                    Had
                      F1 (Hz)   F2 (Hz)   Duration (ms)       F1 (Hz)   F2 (Hz)   Duration (ms)
Mean of all tokens    690.44    2155.08   132.64              764.58    2046.39   135.04
Standard deviation    141.61    231.51    33.77               68.09     217.39    37.38
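The group means in (11) to (13) can be set side by side by taking, for each word pair, the difference between the second word and the first in F1, F2 and duration. The sketch below is purely descriptive and uses only the published means; the names `PRODUCTION_MEANS`, `PAIRS` and `mean_differences` are hypothetical, and a difference of means says nothing about individual subjects or statistical reliability.

```python
# Mean F1 (Hz), F2 (Hz) and vowel duration (ms), copied from tables (11)-(13).
PRODUCTION_MEANS = {
    "beat": (391.68, 2414.24, 116.96),
    "bit": (391.52, 2439.44, 105.88),
    "who'd": (417.61, 916.05, 203.14),
    "hood": (460.80, 1007.25, 111.30),
    "head": (690.44, 2155.08, 132.64),
    "had": (764.58, 2046.39, 135.04),
}

PAIRS = [("beat", "bit"), ("who'd", "hood"), ("head", "had")]


def mean_differences(first, second):
    """Return second-minus-first differences in F1, F2 and duration."""
    f1_a, f2_a, dur_a = PRODUCTION_MEANS[first]
    f1_b, f2_b, dur_b = PRODUCTION_MEANS[second]
    return {
        "f1_hz": round(f1_b - f1_a, 2),
        "f2_hz": round(f2_b - f2_a, 2),
        "duration_ms": round(dur_b - dur_a, 2),
    }


DIFFERENCES = {pair: mean_differences(*pair) for pair in PAIRS}

for pair, diff in DIFFERENCES.items():
    print(pair, diff)
```

A higher F1 is conventionally read as a lower tongue position and a higher F2 as a more front one, which is the reading applied to these means in the surrounding text.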

The average values of F1 and F2 indicate that there is a quality difference between the vowels of head and had, with the ea in head being realized in a more front and higher position. Nevertheless, the situation is complicated by individual differences, which are reflected in the high standard deviations. The pattern shown by the average values is exhibited by only six of the 13 subjects with analyzable recordings (see Appendix C). There are two subjects whose F1 values in head are even higher than in had, which suggests that their /e/ has a lower articulatory position than their /æ/. The remaining

five subjects do not show a clear quality distinction between head and had. For them, the pronunciations of the two words may be the same. No obvious length difference between the two words is found in the production data.

5. Discussion

Basically, the identification results suggest that, in terms of vowel quality, the subjects encounter difficulties in perceiving the stimuli as the appropriate English phonemes. Instead, they are more prone to make judgments on the basis of vowel length. The overall perceptual performance for the four contrasts can be summarized as follows:

(14) Overall performance in the identification test

a. /i/-/ɪ/ → distinguished by length, not by quality
b. /u/-/ʊ/ → distinguished by length, not by quality
c. /ɔ/-/ɒ/ → distinguished by length, not by quality
d. /e/-/æ/ → no distinction

Three contrasts in this study (/i/-/ɪ/, /u/-/ʊ/, /ɔ/-/ɒ/) involve a length difference. The subjects tend to distinguish them by temporal cues, and no evidence for the use of spectral cues can be found. For the contrast /e/-/æ/, where spectral difference is the only available cue, the overall identification results approach chance level (37.3% incorrect for /e/, 42.6% incorrect for /æ/).

5.1 Phonological Perspective

Recall that, in terms of vowel quality, the contrasts /i/-/ɪ/ and /u/-/ʊ/ found in RP do not have phonological counterparts in Vietnamese, and that the phonological counterparts of the RP contrasts /ɔ/-/ɒ/ and /e/-/æ/ can be found in Vietnamese. In

addition, temporal cues and spectral cues are utilized by both languages. Based on a phonological comparison of RP and Vietnamese, the following prediction can be made (cf. (5)).

(15) Predicted subjects’ perceptual performance

a. /i/-/ɪ/ → distinguished by length, not by quality
b. /u/-/ʊ/ → distinguished by length, not by quality
c. /ɔ/-/ɒ/ → distinguished by both length and quality
d. /e/-/æ/ → distinguished by quality
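The comparison between the observed pattern in (14) and the predicted pattern in (15) can be made explicit by encoding, for each contrast, the set of cues by which it is distinguished. This is only a bookkeeping sketch: the dictionary names are hypothetical and the IPA keys are a reading of the contrast labels.

```python
# Cues by which each contrast is predicted to be distinguished, from (15).
PREDICTED = {
    "i-ɪ": {"length"},
    "u-ʊ": {"length"},
    "ɔ-ɒ": {"length", "quality"},
    "e-æ": {"quality"},
}

# Cues by which each contrast is actually distinguished, from (14).
# An empty set stands for "no distinction".
OBSERVED = {
    "i-ɪ": {"length"},
    "u-ʊ": {"length"},
    "ɔ-ɒ": {"length"},
    "e-æ": set(),
}

# Contrasts where the prediction and the observation disagree, together with
# the predicted cues that the subjects did not appear to use.
MISMATCHES = {
    contrast: PREDICTED[contrast] - OBSERVED[contrast]
    for contrast in PREDICTED
    if PREDICTED[contrast] != OBSERVED[contrast]
}

for contrast, unused in MISMATCHES.items():
    print(contrast, "predicted but unused cues:", sorted(unused))
```

Under this encoding the two contrasts without an L1 counterpart match the prediction, while the two contrasts with an L1 counterpart each lack the predicted use of vowel quality.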

The subjects’ perceptual performance in the contrasts /i/-/ɪ/ and /u/-/ʊ/ is consistent with the prediction. For the contrast /ɔ/-/ɒ/, where the experimental design offers two cues (spectral and temporal), the subjects show a strong preference for making judgments on a temporal basis and ignore the spectral difference. This pattern is particularly conspicuous in the perception of the manipulated-length stimuli [ɔ] and [ɒː]: when the temporal cue and the spectral cue conflict, the spectral cue gives way to the temporal cue. However, since temporal cues are also exploited in Vietnamese, this phenomenon does not rule out the possibility of phonological L1 transfer. Rather, it can be understood as L1 temporal cues being more prominent than L1 spectral cues when the two co-occur in L2 contrasts.

If the subjects’ perceptual performance in the contrasts /i/-/ɪ/, /u/-/ʊ/, and /ɔ/-/ɒ/ is explicable by L1 transfer, the case of the contrast /e/-/æ/ is more problematic. From a phonological perspective, a mid front unrounded vowel and a low front unrounded vowel exist in both English and Vietnamese. When Vietnamese speakers acquire the English contrast /e/-/æ/, positive L1 transfer is expected to facilitate their distinction of this contrast. Contrary to this assumption, the overall perceptual performance for the contrast /e/-/æ/ indicates that the subjects have difficulty in properly identifying

/e/ and /æ/, which is contrary to the prediction in (15). At this point, L1 transfer cannot be confirmed, because the phonological comparison fails to explain the subjects’ problems in identifying /e/ and /æ/.

5.2 Phonetic Comparison of L1 and L2

An alternative to the phonological approach above is to make a phonetic comparison of the L1 and L2 vowel systems. Indeed, a comparison at the phonetic level is in line with the Speech Learning Model (Flege, 1995) and the Perceptual Assimilation Model (Best, 1995), since both L2 speech perception models are based on phonetic similarity and dissimilarity between L1 and L2. If there is L1 transfer, the subjects would perceive the stimuli, which are RP pure vowels, in terms of L1 pure vowels. To this end, a group of Vietnamese pure vowels read by a male Hanoi Dialect speaker are recorded and measured via Praat (5.1.31). The microphone and the recording conditions are the same as in the production test. A phonetic comparison of the Vietnamese pure vowels and the stimuli is provided in F1-F2 plots (see (16) below) produced by JPlotFormants (Billerey-Mosier, 2002).
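One way to put numbers on the kind of comparison an F1-F2 plot supports is to measure distances between vowels in the F1-F2 plane. The sketch below is an assumption-laden illustration, not the procedure used in this study: it converts Hz to the Bark scale with Traunmüller's (1990) approximation, which the paper does not mention, and all function and variable names are hypothetical. Since the Vietnamese measurements are not listed in the text, `closest_category` takes the L1 categories as an argument, and the demonstration uses only the RP stimulus values from (6).

```python
import math


def hz_to_bark(f_hz):
    """Traunmüller's (1990) approximation of the Bark scale."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_distance(a, b):
    """Euclidean distance between two (F1, F2) points after Bark conversion."""
    return math.dist([hz_to_bark(x) for x in a], [hz_to_bark(x) for x in b])


def closest_category(token, categories):
    """Name of the category in `categories` nearest to `token` in F1-F2 space.

    `categories` maps a label to an (F1, F2) pair in Hz; for an assimilation
    analysis it would hold measured L1 vowels, which must be supplied.
    """
    return min(categories, key=lambda name: bark_distance(token, categories[name]))


# (F1, F2) of the RP stimuli in Hz, copied from table (6).
RP_STIMULI = {
    "i": (272, 2361), "ɪ": (380, 2085),
    "u": (347, 1149), "ʊ": (406, 1103),
    "ɔ": (429, 697), "ɒ": (551, 860),
    "e": (525, 1943), "æ": (713, 1615),
}

# Spectral separation of the two members of each RP contrast.
CONTRAST_DISTANCES = {
    (a, b): bark_distance(RP_STIMULI[a], RP_STIMULI[b])
    for a, b in [("i", "ɪ"), ("u", "ʊ"), ("ɔ", "ɒ"), ("e", "æ")]
}
```

With measured L1 formants in hand, `closest_category(RP_STIMULI["æ"], l1_vowels)` would return the L1 label nearest to the RP stimulus, where `l1_vowels` is a placeholder for data not given here.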

(16) Phonetic comparison of Vietnamese pure vowels and the stimuli5

As the above plots show, although the vowel systems of the two languages have similar phonemic contrasts (three degrees of openness, three degrees of backness), their phonetic realizations vary greatly. If the above plots are simplified into a vowel space quadrilateral, it is probable that L1 transfer occurs at the phonetic level, as illustrated in (17).

(17) Quadrilateral of Vietnamese pure vowels and stimuli

Legend: “E” refers to RP; “V” refers to Vietnamese. The symbols before “E” or

“V” are IPA symbols. E.g. “E” denotes RP //, and “iV” denotes Vietnamese /i/.

iV = Vietnamese /i/    eV = Vietnamese /e/
aV = Vietnamese /ɛ/    wV = Vietnamese /ɯ/
ee = Vietnamese /ɤ/    er = Vietnamese /ɤ̆/
AV = Vietnamese /a/    A2 = Vietnamese /ă/
uV = Vietnamese /u/    oV = Vietnamese /o/
0V = Vietnamese /ɔ/
iE = RP /i/    IE = RP /ɪ/
eE = RP /e/    aE = RP /æ/
uE = RP /u/    vE = RP /ʊ/
oE = RP /ɔ/    0E = RP /ɒ/

In (17), it is assumed that different RP vowels may be perceived as the same Vietnamese vowel due to their phonetic similarity. In this paper, this phenomenon is called assimilation. For the RP contrasts /i/-/ɪ/ and /u/-/ʊ/, this assumption of phonetic assimilation provides further support for phonological transfer. So far, L1 transfer has been the most obvious explanation for the subjects’ perceptual performance in the contrasts /i/-/ɪ/ and /u/-/ʊ/. For the RP contrast /ɔ/-/ɒ/, phonetic assimilation offers a reasonable explanation for the subjects’ disregard of the spectral difference in identification: because they cannot phonetically associate RP /ɔ/ and /ɒ/ with Vietnamese /o/ and /ɔ/, they resort to the temporal cue, which is more reliable for them than the spectral cue. In this sense, the prominence of the length cue in the identification of this contrast also receives a plausible explanation. For the RP contrast /e/-/æ/, chances are that both RP /e/ and /æ/ are assimilated to Vietnamese /ɛ/. This phonetic assimilation answers the puzzle of why phonological L1 transfer does not occur. In light of this, although the overall difficulty in the perception of /e/ and /æ/ cannot be explained by phonological L1 transfer, it can still be accounted for by phonetic L1 transfer.

5.3 Production Perspective

Given the above assumptions, it seems that L1 transfer can account for everything. However, once the subjects’ performance in production is considered, the picture may be somewhat different.

5.3.1 The /i/-/ɪ/ Contrast

(18) compares the subjects’ average production of the RP contrast /i/-/ɪ/ with the Vietnamese pure vowels and the stimuli. In (18), the production /i/ refers to the ea in beat produced by the subjects, and the production /ɪ/ refers to the i in bit pronounced by them.

(18) Comparison of the production of /i/-/ɪ/, the Vietnamese vowels, and the stimuli

The subjects’ pronunciations of /i/ and /ɪ/ overlap in (18). Moreover, both /i/ and /ɪ/ in production are phonetically close to Vietnamese /i/. Thus, the subjects’ production of /i/ and /ɪ/ further confirms the decisiveness of L1 transfer in the perception of the contrast /i/-/ɪ/.
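The overlap of beat and bit can be checked against the summary rows of Appendix A. The following is a minimal sketch, not the procedure used in this paper: it divides the difference between the two mean values by a pooled standard deviation, an index chosen here purely for illustration. The function name `separation` and the equal-weight pooling are assumptions.

```python
import math

# Summary statistics reported in Appendix A: (mean, standard deviation) in Hz.
BEAT = {"F1": (391.68, 43.14), "F2": (2414.24, 231.57)}
BIT = {"F1": (391.52, 46.46), "F2": (2439.44, 224.40)}


def separation(stats_a, stats_b):
    """Absolute mean difference divided by the pooled standard deviation.

    Equal-weight pooling is an illustrative assumption; both words have
    25 measured tokens in Appendix A, so the weights would be equal anyway.
    """
    mean_a, sd_a = stats_a
    mean_b, sd_b = stats_b
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return abs(mean_a - mean_b) / pooled_sd


for formant in ("F1", "F2"):
    d = separation(BEAT[formant], BIT[formant])
    print(f"{formant}: separation = {d:.3f} pooled SDs")
```

On these figures both formants differ by far less than one pooled standard deviation (roughly 0.11 for F2 and close to zero for F1), which is in line with the overlap described for (18).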

5.3.2 The /u/-/ʊ/ Contrast

The subjects’ pronunciations of who’d and hood show some evidence of a quality difference. (19) compares the subjects’ average production of /u/ and /ʊ/ with the Vietnamese pure vowels and the stimuli. In (19), the production /u/ refers to the vowel of who’d produced by the subjects, and the production /ʊ/ refers to their production of oo in hood. The two ellipses represent the standard deviation range of the production data. The minor axis, parallel to the F1 axis, represents the deviation range of F1, and the major axis, parallel to the F2 axis, represents the deviation range of F2.
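The geometry of such a standard deviation ellipse can be written down explicitly. The sketch below assumes that the ellipse is centred on the mean (F1, F2) of a word and that its semi-axes equal one standard deviation of F1 and of F2; the scaling is not stated in the text, so this is an assumption, as is the helper name `inside_sd_ellipse`. The numbers are the who’d summary values and one hood token from Appendix B, which may differ from the exact data plotted in (19).

```python
def inside_sd_ellipse(f1, f2, mean_f1, mean_f2, sd_f1, sd_f2):
    """Return True if the point (f1, f2) lies within the axis-aligned ellipse
    centred on (mean_f1, mean_f2) with semi-axes sd_f1 and sd_f2 (all in Hz)."""
    return ((f1 - mean_f1) / sd_f1) ** 2 + ((f2 - mean_f2) / sd_f2) ** 2 <= 1.0


# Appendix B summary rows for who'd: means and standard deviations in Hz.
WHOD_MEAN_F1, WHOD_MEAN_F2 = 417.61, 916.05
WHOD_SD_F1, WHOD_SD_F2 = 73.15, 172.69

# The centre is trivially inside its own ellipse.
centre_inside = inside_sd_ellipse(WHOD_MEAN_F1, WHOD_MEAN_F2,
                                  WHOD_MEAN_F1, WHOD_MEAN_F2,
                                  WHOD_SD_F1, WHOD_SD_F2)

# First hood token of subject 1 in Appendix B: F1 = 548 Hz, F2 = 1100 Hz.
hood_token_inside = inside_sd_ellipse(548, 1100,
                                      WHOD_MEAN_F1, WHOD_MEAN_F2,
                                      WHOD_SD_F1, WHOD_SD_F2)

print(centre_inside, hood_token_inside)
```

Under these assumptions the sample hood token falls outside the who’d ellipse. Whether the two ellipses overlap as a whole depends on all tokens and on the scaling actually used for the plot.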

Key to (18): iP = English /i/ in production; IP = English /ɪ/ in production; the remaining labels are as in (16).

(19) Comparison of the production of /u/-/ʊ/, the Vietnamese vowels, and the stimuli

Key to (19): uP = English /u/ in production; vP = English /ʊ/ in production; the remaining labels are as in (16).

Although the two ellipses overlap in part, they do not overlap substantially. This suggests that there is a quality difference between the two words in at least some subjects’ production data. Indeed, of the 11 subjects with analyzable recordings, only four do not exhibit a clear quality difference in production; the other seven show a clear quality difference between the pronunciations of who’d and hood in all tokens (see Appendix B). The four subjects’ lack of a quality difference can be attributed to L1 transfer, but the other seven subjects’ production data cast doubt on the decisiveness of L1 transfer: although L1 transfer can explain the subjects’ perception performance, it fails to explain the quality difference in the production data.

Since L1 transfer is inadequate as an explanation, an alternative is needed. Note that both the spectral cue and the temporal cue are used in English to differentiate /u/ and /ʊ/, and that the seven subjects pronounce who’d and hood differently both in length and in quality. In this sense, their production performance matches the English pattern quite well. This fact leads to the following account of the production data:

(20) Account for the seven subjects’ production data of /u/ and /ʊ/

The subjects’ L2 phonological systems are influenced more by the target language (English) than by L1. In other words, L2 speech perception and production can be explained by L2 itself without the involvement of L1 transfer.

In fact, this account can also explain the perception data: L2 speech perception adheres to the L2 pattern with no involvement of L1 transfer, but the L2 temporal cue is more prominent than the L2 spectral cue, especially when the two cues conflict. Compared with L1 transfer, which fails to explain the production data, this account is more promising in that it explains both the perception data and the production data. If this account is correct, the influence of L1 transfer is minimal in these subjects’ perception of the contrast /u/-/ʊ/.
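The length difference in production can be read off the summary rows of the appendices. As a rough illustration (the ratio is not a measure used in this paper, and the names below are hypothetical), the sketch compares the mean vowel durations of who’d and hood from Appendix B with those of beat and bit from Appendix A.

```python
# Mean vowel durations in ms, taken from the Mean rows of Appendices A and B.
MEAN_DURATION_MS = {
    "who'd": 203.14, "hood": 111.3,   # Appendix B
    "beat": 116.96, "bit": 105.88,    # Appendix A
}


def duration_ratio(long_word, short_word, durations=MEAN_DURATION_MS):
    """Ratio of the mean duration of the nominally long vowel to the short one."""
    return durations[long_word] / durations[short_word]


for pair in (("who'd", "hood"), ("beat", "bit")):
    print(pair, round(duration_ratio(*pair), 2))
```

On these means, who’d is roughly 1.8 times as long as hood in the subjects’ production, whereas beat is only about 1.1 times as long as bit.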

5.3.3 The /e/-/æ/ Contrast

In the production of head and had, some heterogeneity is found among the subjects. Their production performance can be divided into three types: (a) with a normal quality distinction, (b) without a quality distinction, (c) with a reversed quality distinction. Among the 13 subjects with available production data, six make a normal quality distinction (see Appendix C). (21) compares their production of /e/ and /æ/ with the Vietnamese pure vowels and the stimuli. In (21), the production /e/ denotes the vowel of head produced by the subjects, and the production /æ/ denotes the vowel of had pronounced by them.

(21) Comparison of the first group’s production of /e/ and /æ/, the Vietnamese vowels, and the stimuli

As can be seen in (21), the English [e] and [æ] produced by this group of subjects are two distinct categories in the phonetic space. In fact, this group shows satisfactory identification performance: none of them makes more than 30% incorrect identifications (see (22)). Note that no temporal cue can be utilized in the perception of this contrast. In this sense, their production performance matches their perception performance well, and both are consistent with the prediction in (15). Hence, the phonological L1 transfer, which facilitates the distinction of the contrast /e/-/æ/, may be at work in their perception of /e/ and /æ/.

Key to (21): eP = English /e/ in production; aP = English /æ/ in production; the remaining labels are as in (16).

(22) Identification performance of the first group

Subject number   [e]   [æ]   Two stimuli as a whole
7                2     0     2
9                0     3     3
10               1     1     1
11               1     2     3
13               1     0     1
15               1     1     2

It is also worth noting that, in terms of degree of openness, these subjects’ pronunciations of /e/ and /æ/ are closer to the target language (English) than to the native language (Vietnamese). This makes the account given for the contrast /u/-/ʊ/ (cf. (20)) applicable to /e/-/æ/ as well. That is, these subjects’ L2 phonological systems are target-like in themselves and do not involve L1 transfer. Since both explanations are possible, the conclusion for these subjects is that there may be phonological L1 transfer in the perception of /e/ and /æ/, but the two vowels are phonetically closer to the target language in the subjects’ L2 phonological systems.

The other five subjects’ pronunciations of English /e/ and /æ/ do not show a clear quality difference (see Appendix C). (23) compares their production of the contrast /e/-/æ/ with the Vietnamese pure vowels and the stimuli.


(23) Comparison of the second group’s production of /e/ and /æ/, the Vietnamese vowels, and the stimuli

Key to (23): eP = English /e/ in production; aP = English /æ/ in production; the remaining labels are as in (16).

In (23), the second group’s pronunciations of /e/ and /æ/ are close to each other in the phonetic space, and the sound closest to both of them is Vietnamese /ɛ/. Two points can be inferred from this. First, there is no distinction between /e/ and /æ/ in these subjects’ L2 phonological systems. Second, the vowels of both head and had are assimilated to Vietnamese /ɛ/. Meanwhile, the misidentification percentages of these subjects range between 40% and 50%, which is almost chance level. Given the absence of the /e/-/æ/ distinction in production, their perceptual performance is easy to explain: because head and had sound the same to them, they identify the stimuli by guessing. For these subjects, since both English /e/ and /æ/ are assimilated to Vietnamese /ɛ/, the root of their problems in perception and in production is the phonetic L1 transfer illustrated in (17).

The remaining two subjects (subjects 1 and 4) show a reversed quality distinction. They pronounce the a in had with lower F1 values than the ea in head, which reveals that /æ/ has a higher articulatory position than /e/ in their L2 phonologies. Their identifications of /e/ and /æ/ are highly problematic. Given the reversed quality distinction, this is not surprising. For these two subjects, the root of the perception problems is a misconception in their L2 phonological systems rather than L1 transfer.

5.3.4 The /ɔ/-/ɒ/ Contrast

Since no production data are available for the contrast /ɔ/-/ɒ/, it is difficult to discuss it from the production perspective. However, as discussed in Section 4.2, there may be a low and back articulatory position for the production of RP /ɔ/ and /ɒ/. The low and back position might suggest that the two vowels are pronounced phonetically close to Vietnamese /ɔ/ rather than to Vietnamese /o/. This would run counter to the phonetic L1 transfer in (17). If the low and back position is real, the phonetic L1 transfer for this contrast can be discarded.

If the above assumption is extended further, there are two possible cases. In the first case, the subjects make a quality distinction in production. If so, there may be phonological L1 transfer which facilitates the differentiation of /ɔ/ and /ɒ/, and the subjects’ perceptual performance can be attributed to the prominence of the L1 temporal cue, as discussed in Section 5.1. In the second case, the subjects make no quality distinction in production. If so, L1 transfer can hardly be the root of the problems in perception and in production, because neither the phonological nor the phonetic L1 transfer can explain such problems.


5.4 Summary

In the preceding sub-sections, the role of L1 in L2 speech perception has been discussed through a phonological and phonetic comparison of RP and Vietnamese and through an analysis of the production data. It is found that the effect of L1 transfer varies among different L2 contrasts. For the contrast /i/-/ɪ/, L1 transfer is the decisive factor in perception. For the contrast /u/-/ʊ/, though L1 transfer exerts great influence, it is not the decisive factor: the majority of the subjects are influenced more by L2 than by L1, and there are some phenomena that L1 transfer fails to explain. For the contrast /e/-/æ/, L1 transfer can occur at the phonological level or the phonetic level. Though the contrast /e/-/æ/ has a phonological counterpart in L1, the phonetic L1 transfer can still make perception problematic. For the contrast /ɔ/-/ɒ/, there are two possibilities, depending on whether there is a quality difference in production. In the first case, there is phonological L1 transfer. In the second case, the problems in perception and production are caused by factors other than L1 transfer.

Given this discussion, the research question in (1) can be answered as follows:

(24) Effect of L1 transfer in L2 speech perception

1. The effect of L1 transfer varies among different L2 contrasts. It is decisive in the perception of some contrasts, but its effect is minor in some other contrasts. It cannot fully determine the perception of contrasts with no L1 phonological counterpart.

2. L1 transfer can occur at the phonological level or the phonetic level. Different levels of L1 transfer may lead to different perceptual performance. Because of the phonetic L1 transfer, the perception of contrasts with L1 phonological counterparts can also be problematic.

6. Conclusion

To examine the effect of L1 transfer in L2 speech perception, the present study investigates Vietnamese speakers’ perception of four RP vowel contrasts. It is found that Vietnamese speakers rely primarily on temporal cues to distinguish English vowel contrasts. When there is no temporal cue, identification performance is highly problematic. Perception performance and production performance may vary among different L2 contrasts, and neither is fully consistent with the phonological comparison of L1 and L2. Through an analysis of the perception and production data, and a phonological and phonetic comparison of L1 and L2, the effect of L1 transfer can be summarized as follows:

1. L1 transfer is a crucial factor in L2 speech perception, but its effect may vary among different L2 contrasts.

2. L1 transfer can occur at the phonological level or the phonetic level. Even with L1 phonological counterparts, L2 contrasts can still be perceived poorly under the influence of the phonetic L1 transfer.

Besides addressing existing theoretical questions, this study also provides empirical evidence for studies of L2 speech perception. Finally, it should be mentioned that all the above conclusions are based on a logical analysis of the available data and need to be tested further. It is hoped that more studies will address this issue and unveil the nature of L1 transfer in L2 speech perception.

Appendix A: Production Data of Beat and Bit (“-” refers to absent data)

Subject             Beat                             Bit                              Clear borderline
                    F1 (Hz)  F2 (Hz)  Duration (ms)  F1 (Hz)  F2 (Hz)  Duration (ms)
1                   417      2318     95             458      2347     117
                    348      2241     139            370      2295     135
2                   421      2535     129            370      2655     110
                    483      2543     106            363      2675     120
3                   376      2663     117            390      2475     124
                    401      2734     90             391      2761     102
4                   417      2598     138            353      2587     89
                    327      2559     120            381      2623     135
5                   382      2132     94             358      2189     86
                    308      2245     85             313      2283     85
6                   398      2539     84             494      2549     76
                    455      2653     94             467      2521     77
7                   370      2378     106            385      2224     114
                    437      2193     107            394      2462     130
8                   -        -        -              -        -        -
                    445      2595     126            472      2590     117
9                   418      2256     130            406      2276     95
                    386      2217     136            429      2220     120
10                  342      2173     170            323      2257     151
                    375      2056     182            326      2149     138
11                  443      2491     119            430      2680     133
                    391      2779     104            420      2939     115
12                  -        -        -              -        -        -
                    -        -        -              -        -        -
13                  376      2024     86             388      2129     62
                    390      2139     97             372      2092     63
14                  -        -        -              -        -        -
                    -        -        -              -        -        -
15                  334      2602     154            379      2414     68
                    352      2693     116            356      2594     85
Mean                391.68   2414.24  116.96         391.52   2439.44  105.88
Standard deviation  43.14    231.57   26.10          46.46    224.40   25.68
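The summary rows can be reproduced from the token values. The sketch below is an illustrative check for the beat F1 column only; it assumes that the statistics are taken over all measured tokens (two per subject where available) and that the standard deviation is the sample standard deviation with n − 1 in the denominator. The variable name `BEAT_F1` is hypothetical.

```python
from statistics import mean, stdev

# F1 values (Hz) of all measured beat tokens in Appendix A, in subject order;
# absent tokens ("-") are simply left out.
BEAT_F1 = [
    417, 348, 421, 483, 376, 401, 417, 327, 382, 308,
    398, 455, 370, 437, 445, 418, 386, 342, 375, 443,
    391, 376, 390, 334, 352,
]

n_tokens = len(BEAT_F1)                  # number of measured tokens
beat_f1_mean = round(mean(BEAT_F1), 2)   # compare with the Mean row
beat_f1_sd = round(stdev(BEAT_F1), 2)    # compare with the Standard deviation row

print(n_tokens, beat_f1_mean, beat_f1_sd)
```

With these assumptions the 25 tokens give 391.68 Hz and 43.14 Hz, the values in the Mean and Standard deviation rows for the beat F1 column.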

Appendix B: Production Data of Who’d and Hood (“-” refers to absent data)

Subject             Who’d                            Hood                             Clear borderline
                    F1 (Hz)  F2 (Hz)  Duration (ms)  F1 (Hz)  F2 (Hz)  Duration (ms)
1                   457      1001     190            548      1100     130
                    463      1033     180            462      1138     95
2                   456      779      195            -        -        -
                    530      819      235            526      904      60
3                   426      892      191            507      999      109
                    445      860      173            558      1067     119
4                   486      863      240            472      936      125
                    460      1078     140            527      1195     160
5                   366      1039     157            390      958      81
                    303      996      145            439      914      86
6                   -        -        -              -        -        -
                    -        -        -              -        -        -
7                   491      1102     169            442      1399     88
                    516      1091     157            433      1188     99
8                   366      901      222            542      1118     92
                    443      932      228            -        -        -
9                   378      686      250            466      786      131
                    350      684      245            448      825      146
10                  295      657      272            361      860      136
                    301      725      280            360      923      157
11                  -        -        -              -        -        -
                    -        -        -              -        -        -
12                  -        -        -              -        -        -
                    -        -        -              -        -        -
13                  -        -        -              407      1018     78
                    515      1346     132            423      1006     66
14                  -        -        -              -        -        -
                    -        -        -              -        -        -
15                  377      988      272            443      920      154
                    469      850      193            462      891      114
Mean                417.61   916.05   203.14         460.8    1007.25  111.3
Standard deviation  73.15    172.69   46.08          58.99    149.04   30.59


Appendix C: Production Data of Head and Had (“-” refers to absent data)

Subject             Head                             Had                              Clear borderline
                    F1 (Hz)  F2 (Hz)  Duration (ms)  F1 (Hz)  F2 (Hz)  Duration (ms)
1                   825      1974     119            782      2034     129
                    806      2040     157            772      2044     148
2                   867      2122     130            836      2014     128
                    852      2078     117            849      2051     143
3                   781      2089     207            812      2118     128
                    814      2118     140            845      2136     170
4                   867      2243     175            780      2028     176
                    845      2034     167            786      2022     194
5                   653      1916     118            730      1858     90
                    685      1963     95             667      1859     92
6                   751      2414     119            751      2275     99
                    777      2391     125            802      2267     99
7                   433      2315     107            675      2113     119
                    467      2061     111            662      2127     139
8                   837      2104     156            875      2105     134
                    864      2162     179            833      2116     134
9                   566      2043     136            774      1865     132
                    563      1983     112            794      1849     121
10                  -        -        -              717      1617     164
                    537      2065     169            746      1624     180
11                  551      2761     105            856      2516     131
                    618      2822     121            780      2593     137
12                  -        -        -              -        -        -
                    -        -        -              -        -        -
13                  574      2044     68             642      2036     69
                    592      1890     69             638      2052     58
14                  -        -        -              -        -        -
                    -        -        -              -        -        -
15                  553      2169     160            765      1937     198
                    583      2076     154            710      1950     199
Mean                690.44   2155.08  132.64         764.58   2046.39  135.04
Standard deviation  141.61   231.51   33.77          68.09    217.39   37.38
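The three production types described in Section 5.3.3 can be illustrated with the F1 values in this table. The sketch below is not the criterion used in this paper, which states no numeric cut-off: it compares each subject’s mean F1 of had with the mean F1 of head and applies a hypothetical threshold of 35 Hz, ignoring F2. The names `F1_TOKENS`, `THRESHOLD_HZ` and `production_type` are assumptions made for the example.

```python
from statistics import mean

# F1 values (Hz) of the head and had tokens per subject, from Appendix C.
# Absent tokens ("-") are omitted; subjects 12 and 14 have no data.
F1_TOKENS = {
    1: ([825, 806], [782, 772]),
    2: ([867, 852], [836, 849]),
    3: ([781, 814], [812, 845]),
    4: ([867, 845], [780, 786]),
    5: ([653, 685], [730, 667]),
    6: ([751, 777], [751, 802]),
    7: ([433, 467], [675, 662]),
    8: ([837, 864], [875, 833]),
    9: ([566, 563], [774, 794]),
    10: ([537], [717, 746]),
    11: ([551, 618], [856, 780]),
    13: ([574, 592], [642, 638]),
    15: ([553, 583], [765, 710]),
}

THRESHOLD_HZ = 35  # hypothetical cut-off, chosen for illustration only


def production_type(head_f1, had_f1, threshold=THRESHOLD_HZ):
    """Classify by the difference in mean F1 (had minus head).

    A more open /æ/ has a higher F1, so a clearly positive difference is the
    normal pattern and a clearly negative one is the reversed pattern.
    """
    diff = mean(had_f1) - mean(head_f1)
    if diff > threshold:
        return "normal"
    if diff < -threshold:
        return "reversed"
    return "no distinction"


groups = {"normal": [], "no distinction": [], "reversed": []}
for subject, (head_f1, had_f1) in F1_TOKENS.items():
    groups[production_type(head_f1, had_f1)].append(subject)

for label, subjects in groups.items():
    print(label, subjects)
```

With this cut-off the sketch yields six, five and two subjects, with subjects 1 and 4 in the reversed group and subjects 7, 9, 10, 11, 13 and 15 in the normal group, which coincides with the split reported in Section 5.3.3. A different threshold, or a criterion that also uses F2, could group borderline subjects differently.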

Notes

* This paper grew out of the research for my MA dissertation. Thanks go to Guangxi Normal University for assistance and to the subjects for participation. I am greatly indebted to Lian-Hee Wee for his valuable guidance and suggestions. I also wish to thank the reviewer for many helpful comments.

1 The reason for using RP vowels is provided in Section 3.1.

2 Some symbols for transcription are different from the original source.

3 Since all the subjects in this study are from northern Vietnam, where Northern Dialect (Hanoi Dialect) is spoken, only the vowel inventory of Hanoi Dialect is listed here. In this paper, Vietnamese normally refers to Hanoi Dialect.

4 For more about the VTTN program, cf. http://www.britishcouncil.org/vietnam-english-teacher-vttn-network.htm and http://www.teachingenglish.org.uk/elt-projects/vietnam-english-teacher-trainer-network-vttn-project.

5 “AV” and “A2” overlap in (16).

References

Beddor, P. S., & Strange, W. (1982). Cross-language study of perception of the oral-nasal distinction. Journal of the Acoustical Society of America, 71, 1551-1561.

Best, C. T. (1995). A direct realist view of cross-language speech perception. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 171-204). Timonium: York Press.

Best, C. T., & Strange, W. (1992). Effects of phonological and phonetic factors on cross-language perception of approximants. Journal of Phonetics, 20, 305-330.

Billerey-Mosier, R. (2002). JPlotFormants (Version 1.4) [Computer program]. Retrieved August 20, 2010, from http://www.linguistics.ucla.edu/people/grads/billerey/PlotFrog.htm.

Boersma, P., & Weenink, D. (2010). Praat: Doing phonetics by computer (Version 5.1.31) [Computer program]. Retrieved August 14, 2010, from http://www.praat.org/.

Bohn, O.-S. (1995). Cross-language speech perception in adults: First language transfer doesn’t tell it all. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 279-304). Timonium: York Press.

Bohn, O.-S., & Flege, J. E. (1990). Interlingual identification and the role of foreign language experience in L2 vowel perception. Applied Psycholinguistics, 11, 303-328.

British Council. (n.d.). Vietnam’s English Teacher and Trainer Network (VTTN). Retrieved June 15, 2010, from http://www.britishcouncil.org/vietnam-english-teacher-vttn-network.htm.

British Council, & British Broadcasting Corporation. (2008, November 19). The Vietnam English Teacher and Trainer Network (VTTN) Project. Retrieved June 15, 2010, from http://www.teachingenglish.org.uk/elt-projects/vietnam-english-teacher-trainer-network-vttn-project.

Bunnell, H. T. (1999). Simplified Vowel Synthesis Interface [Computer program]. Retrieved June 17, 2010, from http://www.asel.udel.edu/speech/tutorials/synthesis/vowels.html.

Cruttenden, A. (2008). Gimson’s pronunciation of English (7th ed.). London: Hodder Education.

Flege, J. E. (1995). Second language speech learning: Theory, findings, and problems. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 233-277). Timonium: York Press.

Flege, J. E., & Eefting, W. (1988). Imitation of a VOT continuum by native speakers of English and Spanish: Evidence for phonetic category formation. Journal of the Acoustical Society of America, 83, 729-740.

Gottfried, T. L., & Beddor, P. S. (1988). Perception of temporal and spectral information in French vowels. Language and Speech, 31, 57-75.

Henton, C. G. (1983). Changes in the vowels of received pronunciation. Journal of Phonetics, 11, 353-371.

Iverson, P., & Kuhl, P. K. (1995). Mapping the perceptual magnet effect for speech using signal detection theory and multidimensional scaling. Journal of the Acoustical Society of America, 97, 553-562.

Pham, A. H. (2008). The non-issue of dialect in teaching Vietnamese. Journal of Southeast Asian Language Teaching, 14, 22-39.

Roca, I., & Johnson, W. (1999). A course in phonology. Oxford: Blackwell.

Rochet, B. L. (1995). Perception and production of second-language speech sounds by adults. In W. Strange (Ed.), Speech perception and linguistic experience: Issues in cross-language research (pp. 379-410). Timonium: York Press.

Sung, E.-K. (2005). Perception of flaps in American English and Korean. In J. Cohen, K. T. McAlister, K. Rolstad, & J. MacSwan (Eds.), Proceedings of the 4th International Symposium on Bilingualism (pp. 2197-2221). Somerville: Cascadilla Press. Retrieved September 25, 2010, from www.lingref.com/isb/4/172ISB4.PDF.

About the author

Qin Chuan is currently studying in the MA in Language Studies program at Hong Kong Baptist University. He received his BA in English from Guangxi Normal University. His research interests include phonetics, phonology, second language acquisition, and world Englishes.