Boundary placement in connected speech



S33 88th Meeting: Acoustical Society of America

J. Acoust. Soc. Am. 33, 1174--1178 (1961)]; (2) constant ratio [e.g., 2:3; G. Peterson and I. Lehiste, J. Acoust. Soc. Am. 32, 693--703 (1960)]; (3) general linear relationship [Lindblom and Rapp, PILUS No. 21 (1973)]; (4) constant relative variance [Allen and Cooper, J. Acoust. Soc. Am. 53, 379(A) (1973)]. The usefulness of duration as a cue will be discussed, in relation to both theories of time perception and phonological rules for segment length. [Supported by NSF Grant GS-41863.]

11:00

P9. Perceptual measure of cue variability in production under different speaking conditions. M.S. Harris and N. Umeda (Bell Laboratories, Acoustics Research Department, Murray Hill, New Jersey 07974)

In the first pilot study of an extensive investigation into the variability of cue production and cue utilization in perception, subjects were asked to identify which member of a pair of internal open-juncture phrases (grey day--grade A) they heard. Test phrases were spoken by a male speaker (1) in isolation, and (2) in context and edited from the context. In condition 1, 16 subjects responded with less than 3% errors. In condition 2, 24 subjects yielded an error rate of 20%. It seems that cues used by the speaker in condition 1 are not present to the same degree in condition 2, causing a breakdown in performance without the aid of contextual information. The results support the hypothesis that in speech production there are adjustments in cue formation according to speaking situation. We believe that on the listener's side, as well, in accordance with contextual aid, noise level, meaningfulness, familiarity, etc., there must be variable utilization of cues that are present in the speech wave. Results from extensive studies on cue variability in production and perception are presented.

11:15

P10. Perception of suprasegmental and segmental features. Sara Garnes (Department of Linguistics, Ohio State University, Columbus, Ohio 43214)

This study reports the results of an experiment in which perception of a suprasegmental feature dominated perception of a segmental feature. The test language is Icelandic. Stimuli based on a minimal triple /ka:ka/ "cake," /kak:a/ "keg (oblique)," and /kahka/ "to heap up," were prepared on a terminal analog synthesizer. Variables include duration of V1, C2, and the absence or presence of four durations of preaspiration, [h]. If no preaspiration is present, subjects choose /kak:a/ or /ka:ka/, depending on the ratio of V1/C2 durations. If preaspiration is present, subjects choose /kahka/ or /ka:ka/, again only with favorable V1/C2 ratios. The fact that the word /ka:ka/ is chosen, in spite of the presence of an otherwise misplaced segmental feature, indicates that perception of the suprasegmental feature length dominates perception of the segmental feature preaspiration. Evidence from studies of production and perception of American English is discussed which supports the observation made here. A model of perception in which suprasegmental feature detection is warranted. Such a model--in the form of a decision tree--is proposed.
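The choice pattern reported in this abstract can be read as a two-level decision: length first, preaspiration second. The sketch below is only an illustration of that ordering, not the author's proposed tree. The function name, the millisecond units, and the `long_vowel_ratio` threshold are assumptions; the abstract gives no numerical criterion for a "favorable" V1/C2 ratio.

```python
def classify_stimulus(v1_ms, c2_ms, preaspiration_ms, long_vowel_ratio=1.0):
    """Predict the word chosen for one synthetic stimulus.

    long_vowel_ratio is a hypothetical threshold on V1/C2 duration;
    the abstract does not state an actual value.
    """
    if v1_ms <= 0 or c2_ms <= 0:
        raise ValueError("V1 and C2 durations must be positive")
    if preaspiration_ms < 0:
        raise ValueError("preaspiration duration cannot be negative")

    ratio = v1_ms / c2_ms

    # Level 1: the suprasegmental feature (length) is evaluated first.
    # A long vowel relative to C2 yields /ka:ka/ even if [h] is present.
    if ratio >= long_vowel_ratio:
        return "/ka:ka/"

    # Level 2: only with a short vowel does the segmental feature
    # (preaspiration) decide between the remaining two words.
    if preaspiration_ms > 0:
        return "/kahka/"
    return "/kak:a/"
```

With these assumed settings, a stimulus with a long V1 and a nonzero preaspiration duration is still labeled /ka:ka/, which is the dominance effect the abstract describes.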

11:30

P11. Boundary placement in connected speech. N. Umeda and M.O. Harris (Bell Laboratories, Murray Hill, New Jersey 07974)

Pitch is closely correlated with stress and boundary (pause and pseudopause). Stress and pause are not mutually independent; they often form an auditory unit in speech. Such units, we believe, must help listeners to grasp the grammatical and semantic framework of the message. Therefore, before we study fundamental frequency at the acoustic level, we chose to investigate the relation between sentence structure and the placement of boundaries. Two people, whose backgrounds include linguistics and some musical training, listened carefully to an essay read by four speakers, indicating boundaries where they perceived them. The listeners were concentrating more on overall perception of boundary than on its different acoustic characteristics. Analysis of preliminary data shows the listeners to be in 90% agreement with each other about the location of boundary responses. Boundaries were perceived between two successive words in these conditions: on 10% of all boundaries within the same phrase; on 50% of grammatical-unit boundaries in normal structural order, such as subject-verb; and on 70% of boundaries in reversed structural order, such as interruptions. Eighty percent of the boundary events noted by the listeners were heard as occurring in all four speakers.
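The percentages in this abstract are of two kinds: agreement between the two listeners, and the rate at which a boundary was perceived within each structural category. A minimal sketch of both tallies follows. The abstract does not say how agreement was computed, so the overlap measure here (positions marked by both listeners divided by positions marked by either) is an assumption, as are the function names and the representation of boundaries as word-gap indices.

```python
def percent_agreement(marks_a, marks_b):
    """Percent of marked word gaps on which two listeners agree.

    Assumed definition: gaps marked by both / gaps marked by either.
    """
    a, b = set(marks_a), set(marks_b)
    either = a | b
    if not either:
        return 100.0  # neither listener marked anything
    return 100.0 * len(a & b) / len(either)


def boundary_rate_by_category(gap_categories, perceived):
    """Percent of gaps in each structural category heard as a boundary.

    gap_categories maps a word-gap index to a category label (for
    example "within phrase"); perceived is the set of gap indices
    that a listener marked as boundaries.
    """
    totals, hits = {}, {}
    for gap, category in gap_categories.items():
        totals[category] = totals.get(category, 0) + 1
        if gap in perceived:
            hits[category] = hits.get(category, 0) + 1
    return {c: 100.0 * hits.get(c, 0) / n for c, n in totals.items()}
```

Any gap indices and category labels passed to these functions would come from a hand annotation of the text; none are given in the abstract.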

11:45

P12. Characterization of fundamental-frequency contours of speech. S. Maeda (Department of Electrical Engineering and Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139)

Fundamental frequency (F0) contours for 60 isolated American English sentences and a text, read by three native speakers, have been analyzed and these contours have been schematized subjectively by ideal patterns. The basic elements (attributes) of the schematic patterns are a baseline (BL) which represents the gradual fall of the F0 contour along the sentence, a piecewise-linear trapezoidal pattern with rising (R) and lowering (L), and a peak (P) which often occurs at the onset of R. The trapezoidal pattern demarcates a sentence into phonetic groups (PGs) of words. Additional attributes are a rising (R1) which can occur at an intermediate content word in PG, and continuation rise. Intraspeaker differences for these schematic patterns were negligible for the sentences investigated. The organization of the patterns seems to be constrained by two primary factors: the structure of the sentence and a principle of physiological economy in the control of F0 during speech. [Research supported by an NIH grant.]
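One way to picture the schematic pattern is as a linearly falling baseline with a trapezoid added for each phonetic group. The sketch below assumes exactly that and nothing more: the peak (P), the R1 rise, and the continuation rise are left out, and the function name, parameter layout, and all numerical values a caller might supply are hypothetical rather than taken from the abstract.

```python
def schematic_f0(t, duration, bl_start_hz, bl_end_hz, groups):
    """Schematic F0 (Hz) at time t for one sentence.

    groups is a list of (rise_start, rise_end, fall_start, fall_end,
    height_hz) tuples, one trapezoid per phonetic group, with
    rise_start < rise_end <= fall_start < fall_end (same units as t).
    """
    if duration <= 0:
        raise ValueError("duration must be positive")

    # Baseline (BL): gradual linear fall across the sentence.
    baseline = bl_start_hz + (bl_end_hz - bl_start_hz) * t / duration

    # Trapezoid: rising (R), plateau, lowering (L), added to the baseline.
    offset = 0.0
    for r0, r1, l0, l1, height in groups:
        if r0 <= t < r1:
            offset += height * (t - r0) / (r1 - r0)
        elif r1 <= t <= l0:
            offset += height
        elif l0 < t <= l1:
            offset += height * (l1 - t) / (l1 - l0)
    return baseline + offset
```

Sampling this function at regular intervals gives a piecewise-linear contour that can be overlaid on a measured F0 track for visual comparison.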

J. Acoust. Soc. Am., Vol. 56, Supplement
