Thomas Hoffmann (University of Regensburg). Corpus and Experimental Data as Corroborating Evidence: The Case of Preposition Placement in English Relative Clauses. Linguistic Evidence: Empirical, Theoretical, and Computational Perspectives, University of Tübingen, 02.02.-04.02.2006.
Corpus and Experimental Data as Corroborating Evidence: The Case of Preposition Placement in English Relative Clauses
Linguistic Evidence: Empirical, Theoretical, and Computational Perspectives University of Tübingen, 02.02.-04.02.2006
Thomas Hoffmann
(University of Regensburg)
1. Introduction: Corpus vs. Introspection
We do not need to use intuition in justifying our grammars, and as scientists, we must not use intuition in this way. (Sampson 2001: 135)
You don’t take a corpus, you ask questions. […] You can take as many texts as you like, you can take tape recordings, but you’ll never get the answer. (Chomsky in Aarts 2000: 5-6)
Which type of data are we left with then?
1. Introduction: Corpus vs. Introspection
A corpus and an introspection-based approach to linguistics […] can be gainfully viewed as being complementary.
(McEnery and Wilson 1996: 16)
corpus and introspection data = corroborating evidence
case study: P placement in English Relative clauses
1. Introduction: What to Expect
1. corpora vs. introspection?
2. categorical corpus data (ICE-GB corpus)
3. Magnitude Estimation experiment
4. variable corpus data (ICE-GB corpus)
5. conclusion
2. Corpora and Introspection
Arguments against corpus data:
• “performance” problem:
• “negative data” problem:
• “homogeneity” problem:
“only use introspection”
2. Corpora and Introspection
Arguments against corpus data: no corpus
• “performance” problem: yet: performance is a result of competence; modern corpora are representative
• “negative data” problem: yet: only additional (different) data needed
• “homogeneity” problem: yet: empirical claim that needs to be investigated
use corpora + additional data type
2. Corpora and Introspection
Arguments against introspection data:
• “unnatural data” problem:
• “irrefutable data” problem:
• “illusion” problem:
• “stability” problem:
“only use corpora”
2. Corpora and Introspection
Arguments against introspection data: no introspection
• “unnatural data” problem: yet: only additional (context) data needed
• “irrefutable data” problem: yet: depends only on collection method
• “illusion” problem: yet: only additional (natural) data needed
• “stability” problem: yet: empirical claim that needs to be investigated
use corpora + additional data type
2. Corpora and Introspection
Corpora and introspection are corroborating evidence:
introspection (strengths = weaknesses of corpus data):
+ ungrammaticality
+ negative data
+ rare phenomena
corpus (strengths = weaknesses of introspection data):
+ unexpected patterns
+ contextual factors
+ natural language
3. Case Study: Preposition Placement
I want a data source ...
(1) a. which I can rely on [stranded preposition]
b. on which I can rely [pied-piped preposition]
driving question: data source for empirical analysis of (1a,b)?
4. Empirical Study I: Corpus Data
• Corpus used:
International Corpus of English ICE-GB (Nelson et al. 2002) (educated present-day BE, written & spoken)
• Analysis tool:
GOLDVARB computer programme (logistic regression; Robinson et al. 2001): estimates the relative influence of various contextual factors (weights: <0.5 = inhibiting factors; >0.5 = favouring factors)
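The weight scale can be illustrated with a minimal sketch. It assumes the standard variable-rule logistic model, in which the log-odds of an outcome are the sum of the log-odds of an input probability and of each factor weight. The function names and all numbers below are hypothetical and are not taken from the study.

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def inv_logit(x: float) -> float:
    """Inverse of logit: maps log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def application_probability(input_prob: float, factor_weights: list[float]) -> float:
    """Combine an input probability with factor weights on the log-odds scale.

    Assumption: standard variable-rule model; the slides do not spell it out.
    """
    log_odds = logit(input_prob) + sum(logit(w) for w in factor_weights)
    return inv_logit(log_odds)

baseline = 0.6  # hypothetical input probability
for weight in (0.3, 0.5, 0.7):  # hypothetical factor weights
    p = application_probability(baseline, [weight])
    print(f"weight={weight:.1f} -> probability={p:.3f}")
```

A weight of 0.5 contributes zero log-odds and leaves the baseline unchanged, which is why 0.5 is the dividing line between inhibiting and favouring factors.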
Pstrand/Ppied-piped tokens tested for:
1. finiteness
2. restrictiveness
3. relativizer
4. XP contained in (V / N, e.g. entrance to sth. / Adj, e.g. afraid of sth.)
5. level of formality
6. X-PP relationship (Vprepositional, PPLoc_Adjunct, PPMan_Adjunct …)
except for factor 2 (restrictiveness): all factors discussed in the literature before, but not w.r.t. their interdependence (e.g. Bergh & Seppänen 2000; Trotta 2000)
4. Empirical Study I: Corpus Data I
raw ICE-GB P-placement data:
1074 finite relative clauses
659 (61.4%) tokens: pied piped
415 (38.6%) tokens: stranded
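The percentages follow directly from the raw counts. A short check, using only the figures given above (the variable and function names are illustrative):

```python
# Raw ICE-GB counts as reported on the slide.
PIED_PIPED = 659
STRANDED = 415
TOTAL = PIED_PIPED + STRANDED  # finite relative clauses

def share(count: int, total: int) -> float:
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100.0 * count / total, 1)

print(TOTAL, share(PIED_PIPED, TOTAL), share(STRANDED, TOTAL))
```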
as expected: many categorical effects
accidental vs. systematic gaps?
4.1 Categorical corpus data
1. relativizer:
all that/Ø-tokens in ICE-GB stranded
176 that+Pstranded tokens
(2) a data source on that I can rely
177 Ø+Pstranded tokens
(3) a data source on Ø I can rely
ICE-GB result: expected
implications: (2) = (3)? / that ≠ WH-
4.2 Categorical corpus data: that/Ø ≠ WH-relatives
2. X-PP relationship:
Literature (e.g. Bergh & Seppänen 2000; Trotta 2000):
Pstranding favoured with complement PP
disfavoured with adjunct PP
ICE-GB data:
Pstranding restricted to PPs which
add thematic information to predicates/events
4.3 Categorical corpus data: Constraints on Pstrand
2. X-PP relationship:
categorical effect of WH-PPAdjuncts-tokens:
a) just P+WH / no that/Ø+P in ICE-GB: manner, degree, frequency & respect PPs, e.g.:
(4) a. the ways in which the satire is achieved <ICE-GB:S1B-014 #5:1:A>
b. the ways which/that/Ø the satire is achieved in
4.3 Categorical corpus data: Constraints on Pstrand
2. X-PP relationship:
categorical effect of WH-PPAdjuncts-tokens:
b) only P+WH (no WH+P), but that/Ø+P attested in ICE-GB: subcat. PP (put sth. in/into/under) & locative, affected loc., direction PP adjuncts
(5) a. … the world that I was working in and studying in <ICE-GB:S1A-001 #35:1B>
b. … the world in which I was working and studying
4.3 Categorical corpus data: Constraints on Pstrand
Claim: comparison of WH- vs that/Ø shows:
P can only be stranded if: PP adds thematic information to predicates/events
• manner & degree adjuncts: compare events “to other possible events of V-ing” (Ernst 2002: 59)
• frequency & respect adjuncts: have scope over temporal information (frequency) and truth value of entire clause (respect)
→ don’t add thematic participant → Pstrand with these: systematic gap
4.3 Categorical corpus data: Constraints on Pstrand
Claim: comparison of WH- vs that/Ø shows:
P can only be stranded if: PP adds thematic information to predicates/events
• subcat. PP & loc., affected loc., direction PP adjuncts:
→ add thematic participant → WH+P with these: accidental gap
4.3 Categorical corpus data: Constraints on Pstrand
Claim: comparison of WH- vs that/Ø shows:
P can only be stranded if: PP adds thematic information to predicates/events
Comparison of WH- vs that/Ø: good evidence, but: still “negative data” problem
→ further corroborating evidence needed
→ Introspection: Magnitude Estimation study
4.3 Categorical corpus data: Constraints on Pstrand
• relative judgements (reference sentence)
• informal, restrictive RCs tested for:
P-PLACEMENT (Pstrand, Ppied-piped)
RELATIVIZER (WH-, that-, Ø-)
X-PP (VPrep, PPTemp/Loc_Adjunct, PPManner/Degree_Adjunct)
• tokens counterbalanced: 6 material groups of 18 tokens each + 36 fillers = 54 tokens
• tokens randomized (Web-Exp-software)
• N = 36 BE native speakers (sex: 18m, 18f / age: 17-64)
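The design arithmetic can be sketched as follows. Crossing the three factors gives 2 × 3 × 3 = 18 conditions. The rotation scheme is an assumption, since the slides only state that tokens were counterbalanced over 6 material groups: here each lexical item has a fixed X-PP type and appears in a different P-PLACEMENT × RELATIVIZER variant in each group. Item labels and function names are hypothetical.

```python
from itertools import product

P_PLACEMENT = ("stranded", "pied-piped")
RELATIVIZER = ("WH", "that", "zero")
X_PP = ("VPrep", "PPTemp/Loc", "PPManner/Degree")
N_FILLERS = 36

variants = list(product(P_PLACEMENT, RELATIVIZER))          # 6 variants per lexical item
conditions = list(product(P_PLACEMENT, RELATIVIZER, X_PP))  # 18 experimental conditions

def material_group(group: int) -> list[tuple[str, str, str, str]]:
    """Hypothetical Latin-square rotation: item j of each X-PP type
    is shown in variant (j + group) mod 6."""
    items = []
    for xpp in X_PP:
        for j in range(len(variants)):
            placement, relativizer = variants[(j + group) % len(variants)]
            items.append((f"{xpp}-item{j + 1}", placement, relativizer, xpp))
    return items

groups = [material_group(g) for g in range(len(variants))]
print(len(conditions), len(groups), len(groups[0]) + N_FILLERS)
```

Under this assumed scheme every group contains each of the 18 conditions exactly once, and no participant sees the same lexical item in two variants.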
5. Empirical Study II: Magnitude Estimation
18 filler sentences: ungrammatical
a. That’s a tape I sent them that done I’ve myself (word order violation; original source: <ICE-GB:S1A-033 074>)
b. There was lots of activity that goes on there (subject contact clause; original source: <ICE-GB:S1A-004 #067>)
c. There are so many people who needs physiotherapy (subject-verb agreement error; original source: <ICE-GB:S1A-003 #027>)
5. Empirical Study II: Magnitude Estimation
ANOVA: significant effects
• P-PLACEMENT: F(1,33) = 4.536, p < 0.05
• RELATIVIZER: F(2,66) = 17.149, p < 0.001
• P-PLACEMENT*X-PP: F(2,66) = 9.740, p < 0.001
• P-PLACEMENT*RELATIVIZER: F(2,66) = 4.217, p < 0.02
5. Empirical Study II: Magnitude Estimation
ANOVA: not significant
• AGE: F(1,33) = 2.760, p > 0.10
• GENDER: F(1,33) = 1.495, p > 0.20
indicates: homogeneity of subjects
5. Empirical Study II: Magnitude Estimation
Post-hoc Tukey test: P-Place*Relativizer
• Ppied-piped:
WH- >> that [p < 0.001]
WH- >> Ø [p < 0.001]
that > Ø [p < 0.010]
• Pstrand: no difference: WH- = that = Ø [p >> 0.100]
5. Empirical Study II: Magnitude Estimation
Post-hoc Tukey test: P-Place*X-PP
• Ppied-piped:
PPMan/Deg > VPrep [p < 0.010]
PPMan/Deg = PPTemp/Loc [p = 0.100]
VPrep = PPTemp/Loc [p > 0.100]
• Pstrand: VPrep > PPTemp/Loc > PPMan/Deg [p < 0.001]
5. Empirical Study II: Magnitude Estimation
[Figure: mean judgements (z-scores) for P+WH, P+that and P+Ø across prepositional verbs, temp/loc adjuncts and manner/deg adjuncts]
Fig. 1: Magnitude estimation result for P + relativizer
P+WH >> P+that > P+Ø
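The z-scores on the vertical axis can be illustrated with a small sketch. It assumes a common practice in magnitude estimation studies: each participant's raw numerical judgements are log-transformed and then standardised within that participant. The slides mention only z-scores, so the log step and the function name are assumptions, and the ratings are invented.

```python
import math
from statistics import mean, stdev

def z_scores(raw_judgements: list[float]) -> list[float]:
    """Standardise one participant's positive magnitude estimates.

    Assumption: judgements are log-transformed before z-scoring.
    """
    logs = [math.log(r) for r in raw_judgements]
    centre = mean(logs)
    spread = stdev(logs)  # sample standard deviation
    return [(x - centre) / spread for x in logs]

# Invented ratings relative to a reference sentence rated 100.
print(z_scores([50.0, 100.0, 200.0]))
```

Standardising within participants makes judgements comparable across people who use very different numerical ranges.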
Fig. 2: Magnitude estimation result for P + relativizer compared with fillers
P+that & P+Ø = ungrammatical fillers → violation of “hard constraint” (Sorace & Keller 2005)
[Figure: mean judgements (z-scores) for P+WH, P+that, P+Ø and the fillers (grammatical, *Agree, *ZeroSubj, *WordOrder) across prepositional verbs, temp/loc adjuncts and manner/deg adjuncts]
[Figure: mean judgements (z-scores) for WH+P, that+P and Ø+P across prepositional verbs, temp/loc adjuncts and manner/deg adjuncts]
Fig. 3: Magnitude estimation result for relativizer + P
WH + P = that + P = Ø + P
VPrep > PPTemp/Loc > PPMan/Deg
[Figure: mean judgements (z-scores) for X+P and the fillers (grammatical, *Agree, *ZeroSubj, *WordOrder) across prepositional verbs, temp/loc adjuncts and manner/deg adjuncts]
Fig. 4: Magnitude estimation result for relativizer + P compared with fillers
VPrep > PPTemp/Loc > PPMan/Deg >> ungrammatical fillers → violation of “soft constraint” (Sorace & Keller 2005)
6. Corroborating Evidence
Corroborating evidence:
corpus: man/deg PPs: no Pstranded (not even with that/Ø) → semantic constraint on Pstranded
experiment: man/deg PPs worst environment for Pstranded, yet: better than ungrammatical fillers (soft constraint violation)
Constraints on variable corpus data (354 finite WH-tokens):
Goldvarb identified 3 independent factors: (Log likelihood = -88.437 Significance = 0.004;
Fit: X-square(27) = 27.977, accepted, p = 0.2040)
1. level of formality (as expected)
2. type of PP contained in (as expected)
3. restrictiveness (unexpected): restrictive RCs favour pied piping (weight: 0.592);
nonrestrictive RCs clearly inhibit pied piping (i.e. favour stranding; weight: 0.248)
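The size of the restrictiveness effect can be expressed as an odds ratio. This is a sketch that assumes the two reported weights can be read as probabilities on a logistic scale, as in the standard variable-rule model. The helper name is illustrative.

```python
def odds(weight: float) -> float:
    """Odds corresponding to a factor weight between 0 and 1."""
    return weight / (1.0 - weight)

RESTRICTIVE = 0.592     # weight reported for restrictive RCs
NONRESTRICTIVE = 0.248  # weight reported for nonrestrictive RCs

# How much higher the odds of pied piping are in restrictive clauses,
# with all other factors held constant.
odds_ratio = odds(RESTRICTIVE) / odds(NONRESTRICTIVE)
print(f"odds ratio (restrictive vs. nonrestrictive): {odds_ratio:.2f}")
```

Under that assumption the odds of pied piping are roughly 4.4 times higher in restrictive than in nonrestrictive relative clauses.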
7. Empirical Study III: Corpus Data II
(6) And uhm he left me there with this packet of Durex which I hadn't got a clue what to do **[with]** to be totally honest <ICE-GB:S1B-049 #167:1:B>
reasons for restrictiveness effect:
1. weaker semantic ties of non-restrictive clause with antecedent (pause/comma)
2. Pied-piped P receives connective function
functionalisation of preposition placement in WH-relative clause
7. Empirical Study III: Corpus Data II
corpus and introspection data = corroborating evidence:
corpora:
• frequency/context effects (e.g. level of formality)
• unexpected patterns (e.g. restrictiveness)
• categorical data require further investigation
introspection:
• differentiation of accidental gaps (WH+P with PPTemp/Loc) and systematic gaps (X+P with PPMan/Deg)
• detection of degrees of ungrammaticality
8. Conclusion
9. References
Aarts, B. 2000. “Corpus linguistics, Chomsky and Fuzzy Tree Fragments”. In: Christian Mair and Marianne Hundt, eds. Corpus Linguistics and Linguistic Theory. Amsterdam and Atlanta, GA: Rodopi, 5-13.
Bard, E.G. et al. 1996. “Magnitude Estimation of Linguistic acceptability”. Language 72:32-68.
Bergh, G. & A. Seppänen. 2000. “Preposition stranding with wh-relatives: A historical survey”. English Language and Linguistics 4:295-316.
Cowart, W. 1997. Experimental Syntax: Applying Objective Methods to Sentence Judgements. Thousand Oaks: Sage.
Huddleston, R. et al. 2002. “Relative constructions and unbounded dependencies”. In: R. Huddleston & G.K. Pullum, eds. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press, 1031-1096.
Jackendoff, R. 2002. Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
Levine, R. & I.A. Sag. 2003. “WH-Nonmovement”. <http://www-csli.stanford.edu/~sag>, 04.07.2004.
Nelson, G. et al. 2002. Exploring Natural Language: Working with the British Component of the International Corpus of English. Amsterdam, Philadelphia: Benjamins.
McEnery, T. and A. Wilson. 1996. Corpus Linguistics. Edinburgh: Edinburgh University Press.
Pesetsky, D. 1998. “Some principles of sentence production”. In: Pilar Barbosa et al., eds. Is the Best Good Enough? Optimality and Competition in Syntax. Cambridge, MA: MIT Press, 337-83.
Penke, M. & A. Rosenbach. 2004. “What counts as evidence in linguistics? An introduction”. Studies in Language 28,3: 480-526.
Pickering, M. & G. Barry. 1991. “Sentence processing without empty categories”. Language and Cognitive Processes 6:229-259.
Quirk, R. et al. 1985. A Comprehensive Grammar of the English Language. London: Longman.
Robinson, J. et al. 2001. “GOLDVARB 2001: A Multivariate Analysis Application for Windows”. <http://www.york.ac.uk/depts/lang/webstuff/goldvarb/manualOct2001>
Sag, I.A. 1997. “English relative constructions”. Journal of Linguistics 33:431-484.
Sampson, G. 2001. Empirical Linguistics. London, New York: Continuum.
Schütze, Carson T. 1996. The Empirical Base of Linguistics: Grammaticality Judgements and Linguistic Methodology. Chicago: University of Chicago Press.
Sorace, Antonella and Frank Keller. 2005. “Gradience in linguistic data”. Lingua 115,11: 1497-1525.
Trotta, J. 2000. Wh-clauses in English: Aspects of Theory and Description. Amsterdam and Atlanta, GA: Rodopi.
Van der Auwera, J. 1985. “Relative that — a centennial dispute”. Journal of Linguistics 21:149-179.