This is an author produced version of a paper published in International Journal of Corpus Linguistics. This paper has been peer-reviewed but does not include the final publisher proof-corrections or journal pagination. Citation for the published paper: Magnus Levin, Hans Lindquist, “Like I said again and again and over and over: On the ADV1 and ADV1 construction with adverbs of direction in English” 2013, vol 18, issue 1: pp 7-34 URL: http://dx.doi.org/10.1075/ijcl.18.1.04lev Publisher: John Benjamins, http://benjamins.com/#catalog/journals/ijcl/main Access to the published version may require subscription.


Like I said again and again and over and over:
On the ADV1 and ADV1 construction in English¹

Magnus Levin, Linnaeus University
Hans Lindquist, Malmö University

¹ We would like to dedicate this paper to Michael Stubbs, who with his theoretical astuteness and methodological ingenuity has served as a source of inspiration for our work over many years, during which he has also been a supportive friend.

1. Introduction: The ADV1 and ADV1 construction

The role of multi-word units in language and their relevance for linguistic theory have attracted increasing interest over the past three decades (cf., e.g., Pawley & Syder 1983; Altenberg 1998; Moon 1998; Sinclair 1999; Wray 2002; Stubbs 2002; Ellis 2008; Granger & Meunier 2008; Gries 2008; Biber 2009). Corpus methodology has made new types of empirical investigations possible and has moved the focus from infrequent idiomatic expressions to frequently recurring word combinations. A large number of labels have been proposed to denote such multi-word units of more or less overlapping types (see Wray 2002:8−10 for a comprehensive list); in the present study we will follow Wray and use the term ‘formulaic sequence’ and, occasionally, the term ‘construction’ in a non-construction-grammar sense.

Stubbs (2002) discusses two types of multi-word units: collocations of content words and “phrases which consist of a combination of grammatical and content words” (p. 227).² The latter can be described by formulae like “PREP the NOUN of the” (ibid. 232) and illustrated by examples like at the end of the. In a later paper, Stubbs distinguishes between “phrase frames”, which are n-grams with a variable slot like plays a * part, and “PoS-grams”, which are strings of part of speech categories like preposition + determiner + singular noun + preposition + determiner (2007a:90−91). The units studied in the present paper, sequences like over and over, do not fit comfortably into any of these categories since they consist of only grammatical words. For the same reason, they are neither collocations nor colligations in the terminology of Sinclair (1998). If anything, they can be said to be a special kind of PoS-gram with the form ADV1 and ADV1, i.e. a combination of two identical adverbs coordinated by the conjunction and. They belong to a type of sequence consisting of symmetrical structures with two items of the same word class joined by some linking word (cf. Lindquist & Levin 2009:173–174). There seems to be a tendency in English and many other languages to favour such sequences. In an article on the PoS-gram “noun-preposition-noun” (in his construction grammar terminology: the NPN construction), Jackendoff (2008:15) suggests that such constructions are “cognitively natural”. In the same article, he mentions a “family of idioms”: back and forth, up and down, to and fro and round and round, which contain combinations of either identical or different adverbs, joined by the conjunction and (ibid. 12). The present paper focuses on combinations with identical adverbs, i.e. the ADV1 and ADV1 construction, disregarding for the time being sequences of different adverbs like back and forth etc.

² Biber (2009:275) makes the same distinction between two types of multi-word sequences: “‘multi-word lexical collocations’ (combinations of content words)” and “‘multi-word formulaic sequences’ (incorporating both function words and content words)” which he sees as the two poles of a cline (ibid. 290).

The ADV1 and ADV1 construction is theoretically interesting for several reasons. As pointed out by Stubbs (2007b:165) about phrases in general, “[t]he semantic denotation of the individual word is weakened and the pragmatic meaning of the multiword-sequence is strengthened”. The sequences under study illustrate a number of typical paths of such semantic/pragmatic change from concrete to abstract – textual or subjective – meanings (cf. Brinton & Traugott 2005; Traugott 1995; Traugott & Dasher 2005). Through the use of historical corpora, we are able to study these diachronic processes and establish paths of change for a number of instantiations of the construction. We can also study the relation between frequency and linguistic change (cf. Barlow 2011:24−25). Furthermore, our findings shed some light on the interplay between speakers’ application of grammatical rules and their use of ready-made phrases, and thus on the question of what constitutes the units of the language (cf. Stubbs 2007a:90), and what is the role of the lexicon (cf. Wray 2002:261−265). Finally, we are able to study the influence of register on various developments. In particular, we are looking for answers to the following questions:

• What syntactic roles do the sequences fulfil and how have these roles developed over time?
• How have the frequencies of the most frequent ADV1 and ADV1 sequences in English changed over time?
• What kind of meanings do they express and how have these meanings developed over time?
• How are these changes related to differences between formal/informal and written/spoken registers?
• To what extent can these ADV1 and ADV1 sequences be said to form a set or a “family” of constructions which have similar form and meaning and display similar developments?

2. Material and method

Our main data sources are The Corpus of Historical American English (COHA) (Davies 2010–) and The Corpus of Contemporary American English (COCA) (Davies 2008–); for discussions and explorations of COCA, see Davies (2009, 2010). In addition, we make some comparisons with The British National Corpus (BNC), The Longman Spoken American Corpus (LSAC) and the new 155-billion-word Google Book Corpus covering 1810–2009 (Davies 2011–).

COHA contains 400 million words from the years 1810−2009. It enables researchers to carry out diachronic studies of rare phenomena, but due to its composition the corpus has some limitations. It contains four different genres: Fiction, Magazines, Newspapers and Non-Fiction. The composition of the sub-corpora is not, however, fully consistent over time. The fiction sub-corpus mainly contains novels (from Moby Dick to Hollywood Wives), short stories, and drama, but also, for the 19th century, poetry, and, for the 20th century, movie scripts (from Singin’ in the Rain to Psycho). The magazine sub-corpus contains a wide variety of sources, ranging from Harpers, The Nation and Sports Illustrated to Time Magazine. About half the newspaper material comes from The New York Times, and an additional 10% each from The Chicago Tribune and The Christian Science Monitor. This focus on rather few newspapers means that findings from the newspaper sub-corpus must be treated with some caution. The biggest categories of non-fiction books include history, social sciences, science and technology.

The genre distribution in the corpus varies over time, which makes diachronic comparisons slightly problematic. The proportion of fiction texts remains around 50%, while the coverage of newspapers and non-fiction is more restricted. Newspapers only occur in COHA from the 1860s and only constitute around 7% of the corpus size per decade until the 1920s. Thus, overall findings from the corpus are largely based on the language of fiction. Comparisons across the whole span of the corpus are also affected by the fact that the 1810s sub-corpus only contains around one million words. Some of our comparisons over time therefore had to be restricted to the 1850s onwards.

COCA is continuously updated and, in the version we used, contained 425 million words from 1990−2011. This corpus comprises roughly equal proportions of Spoken, Fiction, Magazines, Newspapers and Academic journals. The spoken sub-corpus contains material from news broadcasts and talk shows, such as Good Morning America, 60 Minutes and Jerry Springer. Because there is no spontaneous conversation (see, however, under Texts/types at http://corpus.byu.edu/coca/ for a discussion of the naturalness and authenticity of the spoken data), we also compare with LSAC. This is a corpus of spontaneous conversations from the 1990s of about 5 million words (Leech et al. 2009:xxi, 100). The Fiction sub-corpus in COCA consists of short stories from literary magazines, first chapters of novels, and movie scripts. There are about 100 different titles in the magazines sub-corpus, ranging from Cosmopolitan and Christian Century to Sports Illustrated. The newspaper sub-corpus is sampled from ten different publications, such as The New York Times and The San Francisco Chronicle. Finally, the Academic sub-corpus contains about 100 different peer-reviewed journals, such as American Scholar and Journal of Environmental Health.

The BNC (Aston & Burnard 1998) was used to compare the American material with British English from the 1990s. For this we utilized Mark Davies’ interface (http://corpus.byu.edu/bnc/).

As will be seen in the next section, we began our survey of the ADV1 and ADV1 construction by searching COCA and the BNC for sequences with two adverbs coordinated by and. We then searched for the most frequent sequences in the COHA material, limiting our investigation to five different decades: the 1810s, 1850s, 1900s, 1950s and 2000s.
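As a rough illustration of this kind of search, the following Python sketch finds sequences of two identical adverbs joined by and in raw text. It is only a sketch under stated assumptions: the study searched the corpora themselves, whereas here a small hand-picked word list (ADVERBS) stands in for part-of-speech tagging, and the function name count_adv_and_adv and the sample sentence are invented for the example. The merge_multiples switch contrasts two counting conventions the paper mentions: the note to Table 1 counts over and over and over twice, while footnote 3 counts such multiple tokens as one instance.

```python
from collections import Counter

# Assumption: a hand-picked word list replaces real part-of-speech tagging.
ADVERBS = {"over", "again", "on", "around", "round", "up",
           "through", "by", "down", "out", "back", "in"}


def count_adv_and_adv(text, merge_multiples=False):
    """Count 'X and X' sequences where X is a word in ADVERBS.

    merge_multiples=False: overlapping hits are all counted, so
        'over and over and over' contributes 2.
    merge_multiples=True: a whole run of repetitions is counted once.
    """
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    counts = Counter()
    i = 0
    while i + 2 < len(tokens):
        word = tokens[i]
        if word in ADVERBS and tokens[i + 1] == "and" and tokens[i + 2] == word:
            counts[f"{word} and {word}"] += 1
            if merge_multiples:
                # Jump to the last repetition in this run.
                i += 2
                while (i + 2 < len(tokens) and tokens[i + 1] == "and"
                       and tokens[i + 2] == word):
                    i += 2
        i += 1
    return counts


# Invented sample sentence, for illustration only.
sample = "She said it over and over and over, again and again."
raw = count_adv_and_adv(sample)
merged = count_adv_and_adv(sample, merge_multiples=True)
print(raw)
print(merged)
```

A purely string-based search like this cannot tell a genuine coordination from an accidental one (such as a preposition followed by and plus a new clause beginning with the same word), so hits would still need manual checking.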

3. Results

3.1 Distribution

3.1.1 Overall frequency

Table 1 presents an overview of the twelve most frequent sequences of identical adverbs of direction or position in the COCA. Frequencies for the BNC are supplied as a comparison. Sequences of two identical comparatives like more and more (2870 tokens per 100 million words in COCA), less and less (315), faster and faster (109), further and further (81) etc. have been left out. Similarly, so and so has been left out since its function is very different from that of the others. The inclusion of again and again could perhaps be questioned since in Present-day English again normally expresses neither direction nor position, but it is motivated by the fact that the original meaning of again was “[i]n the opposite direction; back” (OED s.v. again A. adv. 1.a.), and an even stronger motivation is that again and again functions very much like over and over, as will be seen below.

Table 1. The twelve most frequent sequences of identical adverbs of direction or position in COCA and the BNC. Sequences of two identical comparatives and so and so excluded. Frequency per 100 million words.

Rank in COCA   COCA per 100 million words   Phrase                 BNC per 100 million words   Rank in the BNC
1              1719                         over and over          540                         2
2              1027                         again and again        649                         1
3              699                          on and on              395                         3
4              122                          around and around      14                          9
5              96                           round and round        109                         4
6              80                           up and up              75                          5
7              60                           through and through    63                          6
8              29                           by and by              2                           12
9              28                           down and down          38                          7
10             22                           out and out            31                          8
11             16                           back and back          8                           11
12             2                            in and in              11                          10
Total          3900                                                1935

Raw figures. Instances of triplication like over and over and over are here counted twice and instances of hesitation in spoken language have not been disregarded. However, instances where the sequence is not a coordination of two adverbs, like (…) before we did that we had to put the natural plug in and in that plug there were five like aluminium coloured er pins (…), have been disregarded.

The frequencies in Table 1 can be compared with the cut-off frequency used by Biber et al. (1999:990) in their studies of so-called lexical bundles (i.e. any recurring word strings), which was 10 occurrences per one million words (i.e. 1,000 occurrences per 100 million words). With that criterion, only over and over and again and again in the American corpus would qualify. Nevertheless, it is possible to discern a set or family of similar ADV1 and ADV1 sequences where the top ones have a fairly high frequency and the rest are related by having the same form. Except for minor variations, the top-ranking sequences are the same in American and British English. The most striking difference is that the un-hyphenated sequence by and by only occurs twice in the BNC (spelt bye and bye), together with two instances with hyphens. Hyphenated instances are not included in the table, but for some of these sequences there are a fair number of hyphenated tokens in COCA: up-and-up 7 per 100 million words, through-and-through (3), out-and-out (10), by-and-by (3). We will return to the question of hyphenation below.
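The comparison with the lexical-bundle cut-off can be retraced directly from the Table 1 figures. The short Python sketch below converts the per-100-million frequencies to per-million values and keeps the sequences that reach 10 per million; the names TABLE_1, CUTOFF_PER_MILLION and qualifying are invented for the illustration and are not part of the original study.

```python
# Table 1 frequencies per 100 million words: phrase -> (COCA, BNC)
TABLE_1 = {
    "over and over": (1719, 540),
    "again and again": (1027, 649),
    "on and on": (699, 395),
    "around and around": (122, 14),
    "round and round": (96, 109),
    "up and up": (80, 75),
    "through and through": (60, 63),
    "by and by": (29, 2),
    "down and down": (28, 38),
    "out and out": (22, 31),
    "back and back": (16, 8),
    "in and in": (2, 11),
}

# Cut-off cited in the text from Biber et al. (1999:990).
CUTOFF_PER_MILLION = 10


def qualifying(corpus_index):
    """Return {phrase: frequency per million} for phrases at or above the cut-off.

    corpus_index 0 selects the COCA column, 1 the BNC column.
    """
    return {
        phrase: freqs[corpus_index] / 100  # per 100 million -> per million
        for phrase, freqs in TABLE_1.items()
        if freqs[corpus_index] / 100 >= CUTOFF_PER_MILLION
    }


coca_bundles = qualifying(0)
bnc_bundles = qualifying(1)
print(coca_bundles)  # only over and over and again and again remain
print(bnc_bundles)   # no BNC sequence reaches the cut-off
```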

The difference in frequency between around and around and round and round in the two varieties can be explained by the well-known regional differences between the single words around and round (see, however, Section 3.5 below for a modification of this view). Table 1 also shows that three of the sequences are considerably more frequent in American English than in British English: over and over, and, to a lesser extent, again and again and on and on. We will return to this fact below. As will also be shown below, over time many of these sequences have undergone semantic developments and have acquired specialized and partially overlapping meanings, which in turn may have influenced frequency.

In the remainder of the paper, we will focus on the most frequent sequences and those that show the most interesting developments.

3.1.2 Frequency over time

The frequency of the ADV1 and ADV1 sequences has changed notably over time. The developments for a number of them are illustrated in Figure 1.

[Figure 1 not reproduced. Horizontal axis: 1810s, 1850s, 1900s, 1950s, 2000s; vertical scale: 0–25.]

Figure 1. The distribution of again and again, on and on, over and over, by and by, (a)round and (a)round, through and through, up and up, out and out and down and down in COHA, per million words³

As is shown in Figure 1, three of the sequences increased significantly during the period: over and over, again and again (peaking in the 1900s) and on and on. Around and around also rose, but at a much lower level. At the same time, several sequences decreased in frequency: round and round (after having peaked in the 1850s) and through and through (after having peaked in the 1900s). Perhaps the most striking development, however, is seen with by and by. This sequence is quite frequent in the mid- (to late) 19th century, but already in the early 1900s it had started a steep decline and by the mid-20th century it had virtually disappeared from use.⁴

These results do not appear to be based on random fluctuations in COHA, since similar changes over time were found in the Google Book Corpus: Over and over, on and on and around and around have been slowly increasing during the last two centuries, while again and again, by and by, round and round and through and through all peak around 1900 and then decrease. The Google Book Corpus can thus serve to support or disconfirm hypotheses from other corpora.

LSAC provides some additional support for these developments. Again and again, which appears to have been decreasing for a century, only occurs 1.2 times pmw (6 tokens). On and on and over and over, which both are still on the increase in COHA, are more frequent than again and again, but over and over is rarer in LSAC (9.8 pmw; 49 tokens) than in COHA (17.5 pmw) and on and on is on a roughly similar level, 7.8 pmw (39 tokens) in LSAC as compared to 7.2 pmw in COHA.⁵

We thus see that the individual sequences have individual histories, and beginning with section 3.2 below the most interesting will be dealt with one by one. We will then return to a discussion of more generalized aspects in the Discussion section.

³ Unless otherwise specified, all numbers in figures and tables have been post-edited to exclude irrelevant tokens. ‘Multiple’ tokens (on and on and on) have been counted as one instance each. Hyphenated instances are included for by and by (and bye and bye) and out and out. In and in is not included in Figure 1 since there were only 6 tokens of this sequence in the relevant decades (all occurring in the 1850s).

⁴ The decrease of by and by coincides with frequencies for other adverbials meaning ‘soon’ (before long, presently, soon) in COHA decreasing by 50%. Instead, some adverbials meaning ‘in the end’ (eventually, finally, in the end) have more than trebled in frequency. The demise of by and by therefore seems partly explicable in terms of a general change in the use of temporal adverbials in English. The reasons behind this shift require further study.

⁵ LSAC also supports the findings for the other adverbs: Round and round and around and around are roughly equally frequent in LSAC (in spite of the single around being many times more frequent than round in that corpus, as in most corpora). Through and through (3 tokens), by and by (0 tokens), out and out (2) and up and up (3) were, as could be expected, very rare in the spoken corpus. Comparisons between LSAC and the Spoken sub-corpus of COCA show that the less spontaneously produced speech from COCA has consistently higher frequencies for all three sequences: again and again (10.5 pmw), on and on (12.5 pmw) and over and over (25.0 pmw).
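The per-million-word (pmw) figures quoted for LSAC follow from the raw token counts and the corpus size. A minimal Python sketch of that normalization is given here; it assumes that the "about 5 million words" reported for LSAC can be treated as exactly 5,000,000, and the names LSAC_SIZE, per_million and lsac_pmw are invented for the example.

```python
# Assumption: "about 5 million words" is treated as exactly 5,000,000.
LSAC_SIZE = 5_000_000


def per_million(tokens, corpus_size):
    """Normalize a raw token count to occurrences per million words."""
    return tokens * 1_000_000 / corpus_size


# Raw LSAC token counts as given in the text.
lsac_tokens = {"again and again": 6, "over and over": 49, "on and on": 39}

lsac_pmw = {phrase: round(per_million(n, LSAC_SIZE), 1)
            for phrase, n in lsac_tokens.items()}
print(lsac_pmw)
```

Under that assumption the sketch gives 1.2, 9.8 and 7.8 pmw, matching the values reported above.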

3.2 Over and over

One of the definitions of over given by the OED (s.v. over III.7b) is: “So as to make what was an upper surface a lower one, by turning it forward (or laterally) and downwards, or to turn something upside down. Also reduplicated to indicate repetition of such motion, as over and over.” The OED also notes the extended, metaphorical meaning (s.v. over III.15): “repeatedly, many times over. Also over and over again.” In our material there is a difference between over and over and over and over again in that motion verbs do not occur with over and over again, cf. (1) and (1a).

(1) There were three large coins and one smaller one. He turned them over and over, and finally ascertained that the large coins were ten dollar pieces, and the smaller one a five dollar piece. (COHA; Fiction; 1856)
(1a) *he turned them over and over again

In contrast with again in the sequence again and again (see below), the adverb over retains a concrete meaning component denoting motion or change of position in space even when the sequence occurs with the meaning ‘again’. The connection with physical motion, however, has weakened over time. This is illustrated in Figure 2, which shows the proportion of motion verbs used with over and over in COHA over the last 150 years.

[Figure 2 not reproduced. Horizontal axis: 1850s, 1900s, 1950s, 2000s; vertical scale: 0–25.]

Figure 2. The percentage of motion verbs with over and over in COHA

As Figure 2 shows, motion verbs, in this case almost exclusively TURN and ROLL, decrease steadily in frequency over the period.⁶

In a parallel development, the use of again after over and over as in (2) has decreased, so that the most common usage is now over and over without again, as in (3).⁷

(2) “My son -- my dearest son,” she said, over and over again, […]. (COHA; Fiction; 1859)
(3) She said Goddammit, goddammit, goddammit, over and over, like that. (COHA; Fiction; 2000)

This trend is illustrated in Figure 3.

[Figure 3 not reproduced. Horizontal axis: 1850s, 1900s, 1950s, 2000s; vertical scale: 0–100.]

Figure 3. The percentage of over and over followed by again in COHA. Motion verbs excluded.⁸

There is no indication that again is lost faster in collocations with low-frequency verbs than in collocations with high-frequency ones (there was instead a non-significant trend suggesting that again is lost slightly more rapidly with high-frequency verbs).⁹ Bybee (2006:728–729) argues that high frequency constructions (such as no negation) are resistant to change. However, in the present case the change involves a reduction in form, and it seems likely that the reduction can proceed faster in contexts where readers readily recognize the collocation (as is the case for phonological reduction, see Bybee 2006:714–715).

⁶ The differences between the 1900s, 1950s and 2000s are statistically significant (chi-square test, p ≤ 0.05). The figure is based on the following numbers: 1850s 20.3% (28 motion verbs in 138 tokens), 1900s 13.8% (44/317), 1950s 8.2% (27/330) and 2000s 4.6% (24/518).

⁷ One of our reviewers has pointed out that this is reminiscent of Jespersen’s cycle of sentence negation, which indeed it is in the sense that the adverb originally carrying the meaning, again, disappears and only the intensifying modifier, over and over, remains.

⁸ All changes in Figure 3 are significant (p ≤ 0.05), except between the 1900s and the 1950s. The figure is based on the following numbers: 1850s 79.1% (87/110), 1900s 44.0% (120/273), 1950s 40.6% (123/303) and 2000s 29.6% (146/494).
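The chi-square comparisons reported in footnote 6 can be retraced from the counts given there. The paper does not say which variant of the test was used, so the Python sketch below assumes a plain Pearson chi-square on a 2×2 table (motion verb vs. other verb, earlier vs. later decade) without continuity correction, applied to adjacent decades; the function name chi_square_2x2 and the dictionary MOTION are invented for the illustration. Only the standard library is used: with one degree of freedom, the upper-tail probability equals erfc(√(χ²/2)).

```python
import math


def chi_square_2x2(hits_a, total_a, hits_b, total_b):
    """Pearson chi-square (df = 1, no continuity correction) for two proportions."""
    a, b = hits_a, total_a - hits_a
    c, d = hits_b, total_b - hits_b
    n = total_a + total_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# (motion verbs, all tokens of over and over) per decade, from footnote 6.
MOTION = {
    "1850s": (28, 138),
    "1900s": (44, 317),
    "1950s": (27, 330),
    "2000s": (24, 518),
}

decades = list(MOTION)
results = {}
for earlier, later in zip(decades, decades[1:]):
    chi2, p = chi_square_2x2(*MOTION[earlier], *MOTION[later])
    results[(earlier, later)] = (chi2, p)
    print(f"{earlier} vs {later}: chi2 = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions the 1900s–1950s and 1950s–2000s comparisons fall below 0.05 while the 1850s–1900s comparison does not, which is in line with the footnote. The same function can be applied to the counts in footnote 8.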

American English is leading the way in the dropping of again after over and over. The distribution over genres in COHA can be compared with that in the more recent corpora COCA and BNC presented in Figure 4.

[Figure 4 not reproduced. Scale: 0%–100%. Categories: COCA-Spoken, COCA-Fiction, COCA-Magazine, COCA-Newspaper, COCA-Academic, BNC-Spoken, BNC-Fiction, BNC-Magazine, BNC-Newspaper, BNC-Non-Academic, BNC-Academic, BNC-Misc. Legend: again, Ø.]

Figure 4. Over and over followed by again in the different genres in COCA and the BNC¹⁰

Perhaps contrary to expectation, the highest proportion of again following over and over is found in the Spoken and Academic sub-corpora in COCA and the BNC, while the lowest is in Fiction. Furthermore, in spontaneous AmE conversation from LSAC, 49% (24/49) of the instances of over and over are followed by again, which suggests that the loss of again is not an effect of increasing colloquialization. One possible explanation for the retention of the longer form in the spoken language is that wordiness is more acceptable there than in written genres, where more succinctness is expected. The COCA and BNC figures are still puzzling, however, since it is usually Fiction that is most spoken-like, and the high proportion of again in Academic is surprising.

⁹ The top five verbs (SAY, REPEAT, TELL, DO, PLAY) retained again in 22% of the instances (32/145) in the 2000s as compared to 32% (114/351) with the less frequent verbs.

¹⁰ It was not possible to go through the nearly 8,000 tokens manually to identify motion verbs (which do not permit variation with again) and multiple instances (over and over and over (again)). Figure 4 is therefore based on raw numbers.

To conclude, in approximately 60% of the cases in COCA, the full meaning of ‘repetition’ is expressed by over and over without the added again. In the BNC, on the other hand, 61% (337/555) of the cases retain again. This suggests that over and over is becoming synonymous with again and again in American English; it may even be on its way to taking over the earlier role of again and again. This hypothesis is supported by the frequency figures in Table 1 above, where it was shown that while again and again is still the most frequent sequence in British English, over and over is now the most frequent one in American English.¹¹

3.3 Again and again

In the sequence again and again, a word denoting ‘repetition’ is itself repeated. The OED (s.v. again A.4a) gives the meaning of again as “Repetition of an action or fact: another time; once more; any more; anew; […]”, and notes that “[t]his sense is more fully expressed by once again, over again; and the repetition increased by too and again (obs.), again and again, ever and again, time and again.” (ibid. 4b).¹²

In order to get at the meaning and possible changes in meaning of the adverb sequences over and over and again and again, we investigated which verbs they were used with and how freely they were used with different verbs. The similarity in meaning between over and over and again and again is indicated by the fact that they are used with very similar verb types. Diachronically, the type of verbs occurring most frequently with both adverbs has remained stable. In the 1900s, 6 of the 11¹³ top verbs with again and again also occurred with over and over (KISS, REPEAT, GO, SAY, READ, TURN), and the same was the case in the 1950s (although for partially different verbs: SAY, HEAR, REPEAT, TELL, GO, ASK). In the 2000s, the proportion was 7/11 (TELL, REPEAT, SAY, DO, ASK, GO, READ).
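The stability of the shared collocates can be made explicit with a few set operations on the verbs listed above. The Python sketch below is only an illustration: the names SHARED_VERBS, stable_core and newcomers_2000s are invented, and the data are restricted to the shared verbs named in the text, since the full top-eleven lists are not given.

```python
# Verbs among the top eleven for again and again that also occur with
# over and over, per decade, as listed in the text.
SHARED_VERBS = {
    "1900s": {"KISS", "REPEAT", "GO", "SAY", "READ", "TURN"},
    "1950s": {"SAY", "HEAR", "REPEAT", "TELL", "GO", "ASK"},
    "2000s": {"TELL", "REPEAT", "SAY", "DO", "ASK", "GO", "READ"},
}

# Verbs shared by the two sequences in all three decades.
stable_core = set.intersection(*SHARED_VERBS.values())

# Verbs shared in the 2000s that were not among the shared verbs in the 1900s.
newcomers_2000s = SHARED_VERBS["2000s"] - SHARED_VERBS["1900s"]

for decade, verbs in SHARED_VERBS.items():
    print(f"{decade}: {len(verbs)}/11 shared")
print("stable core:", sorted(stable_core))
print("new in 2000s:", sorted(newcomers_2000s))
```

On these lists, REPEAT, GO and SAY are shared in all three decades, while TELL, DO and ASK appear among the shared verbs of the 2000s but not of the 1900s.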

¹¹ Another, less frequent, alternative to again and again and over and over (again), without repetition of any adverb, is over again: I said that during the campaign. And I’ve said it over again as governor (COCA; Spoken; 2011). Usually, (all) over again has the slightly different meaning ‘anew’: It was like learning to fly all over again (COCA; Fiction; 2011).

¹² Of the alternatives mentioned by the OED, too and again is indeed obsolete, only occurring in one text from 1923 in COHA, ever and again only occurs between 1847 and 1947 and time and again is infrequent and decreasing slightly in the 20th century.

¹³ In the 1900s, four verbs were in joint eighth position in the order of frequency, and in the 2000s, two verbs shared tenth place. Hence the inclusion of the top eleven collocates.

An analysis of the sequences in context indicates small differences in preferences between the adverbs and a high degree of interchangeability, as suggested by (4) and (5).

(4) Mother had told me again and again not to be angry with him; (COHA; Fiction; 2000)
(5) I must not feel guilty, Mama and Papa told me over and over, in the days afterward, […]. (COHA; Fiction; 2006)

The COHA material suggests one minor, but significant, difference between the two adverbs: negative connotations such as dislike, annoyance, or sarcasm as in (6) are significantly (p ≤ 0.05) more often expressed with over and over (50% (260/518)) than with again and again (41% (151/369)).¹⁴

(6) Oh, Mark. It’s the same old thing, over and over. We need something new. (COHA; Fiction; 2006)

Even with motion verbs like TURN, the adverbs are used with similar meaning, as in (7).

(7) Dotty groaned in her sleep and turned again and again. (COCA; Fiction; 2000)

There is little difference between again and again and over and over when it comes to collocating verbs, and they are both most common in fiction. The differences found in COHA are instead restricted to proportions in different genres – again and again being more than twice as frequent in fiction as in any other genre while over and over is only slightly more common in fiction – and to diachronic developments: over and over appears to be slowly replacing again and again.

3.4 On and on

One of the meanings given for on in the OED (s.v. on A.4.a.) is: “Onward or forward in space, time, or condition. […] Sometimes repeated for emphasis, as on and on […]”. For on and on, the OED gives (s.v. on A.5.b.): “at length; again and again, continually, eternally; now freq. depreciatively, esp. with verbs indicating speech: interminably, incessantly.” The OED thus acknowledges that there has been a semantic development from denoting position or motion in physical space to denoting extension in time and even repetition (although on and on normally refers to durative action and again and again to punctual repetition), and furthermore notes that the sequence has acquired negative connotations. In this process, the number of different verbs used with on and on has changed, so that although the number of tokens increased steadily during the period, the number of types reached a peak with 56 in the 1900s and then decreased to 37 in the 2000s. During the same period, tokens with motion verbs decreased and tokens with the verb GO increased.

¹⁴ In the present study negative attitudes were sometimes deduced from the immediate context, such as the co-occurrence of a verb expressing negative connotations (She looked almost like an angel, that is if angels had braided hair and prattled on and on. (COHA; Fiction; 2009)), but often wider contexts such as the sentence or paragraph needed to be taken into account.

Typical motion verbs in earlier usage were, e.g., GO, RIDE, and WALK, as in (8). In the analysis, GO was classified as a motion verb in those cases where physical motion was evident (common until the 1900s), as in (9).

(8) As he rode on and on remorse drew him into his grasp. (COHA; Fiction; 1909)

(9) On and on they went, making slow progress because the trail was very poor. (COHA; Fiction; 1909)

The decrease in motion verbs and the increase in the verb GO are made evident in Figure 5.

[Figure 5 plot: percentage (axis 0–70) by decade (1850s, 1900s, 1950s, 2000s); series: GO and motion verbs]

Figure 5. The percentage of motion verbs and the verb GO with on and on in COHA15

15 The numbers for GO in the four decades were 2, 30, 87 and 120, and the numbers for motion verbs were 12, 79, 23 and 17.


The increase in the use of GO fits the typical S-curve often seen in language change, while the motion verbs follow the corresponding downwards trend.
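The percentages behind Figure 5 can be approximated from the raw counts given in note 15. The sketch below assumes that the denominators are the per-decade totals of on and on listed in note 16 (19, 136, 166 and 214); the paper does not state this explicitly, so the resulting figures are an illustration and not the authors' own calculation.

```python
decades = ["1850s", "1900s", "1950s", "2000s"]
go_tokens = [2, 30, 87, 120]       # GO with on and on (note 15)
motion_tokens = [12, 79, 23, 17]   # motion verbs with on and on (note 15)
# Assumption: totals of on and on per decade, taken from note 16.
totals = [19, 136, 166, 214]

def percentages(counts, totals):
    """Share of each count in its decade total, in percent, to one decimal."""
    return [round(100 * c / t, 1) for c, t in zip(counts, totals)]

go_pct = percentages(go_tokens, totals)
motion_pct = percentages(motion_tokens, totals)

for decade, g, m in zip(decades, go_pct, motion_pct):
    print(f"{decade}: GO {g}%  motion verbs {m}%")
```

On these assumptions the share of GO rises across the four decades while that of motion verbs falls, matching the two opposite trends described above.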

In conjunction with the decreasing number of motion verbs and increasing use of the verb GO, there has been an increase in negative connotations expressed with on and on. This is illustrated in Figure 6, where the differences between the 1900s, 1950s and 2000s are significant (p < 0.05).
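As a rough check on the three-decade comparison, the counts in note 16 can be arranged as a 3×2 table (negative versus other tokens per decade). The sketch below assumes a single Pearson chi-square test of independence over that table; the paper may instead have tested the decades pairwise, and the function name `chi_square_rx2` is hypothetical.

```python
import math

# (negative tokens, all tokens) of on and on per decade, from note 16.
counts = {"1900s": (25, 136), "1950s": (90, 166), "2000s": (163, 214)}

def chi_square_rx2(groups):
    """Pearson chi-square for an r x 2 table given (hits, total) per row."""
    hits = sum(h for h, _ in groups)
    n = sum(t for _, t in groups)
    chi2 = 0.0
    for h, t in groups:
        # Two cells per row: hits and non-hits, each with its column total.
        for observed, col_total in ((h, hits), (t - h, n - hits)):
            expected = t * col_total / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

chi2 = chi_square_rx2(list(counts.values()))
# Three rows and two columns give 2 degrees of freedom, for which the
# upper-tail probability is exactly exp(-chi2 / 2).
p = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.1f}, p = {p:.2e}")
```

The closed-form p-value used here holds only for two degrees of freedom, so the last step would need a general chi-square distribution function if more decades were added.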

[Figure 6 plot: percentage (axis 0–80) by decade (1850s, 1900s, 1950s, 2000s); series: negative]

Figure 6. Negative connotations with on and on in COHA16

In the 2000s, GO on and on either refers to negative events continuing incessantly, as in (10), or to speech events that, according to the speaker, continue for too long, as in (11).

(10) When he fell ill, the suffering went on and on. (COHA; Fiction; 2001)

(11) […], once Marv gets going about his car, it’s downright pain-in-the-arse material. He goes on and on, like a kid, and he’s just turned twenty, for Jesus’ sake. (COHA; Fiction; 2006)

Apart from GO, on and on also collocates with some (often negative) verbs of communication, e.g. DRONE, TALK and PRATTLE.

On and on differs from again and again and over and over in that it normally occurs with verbs construed as durative, as in (12), while the latter occur with verbs construed as punctual, as in (13) and (14).

16 The numbers for negative connotations were the following: 1850s 15.8% (3/19), 1900s 18.4% (25/136), 1950s 54.2% (90/166) and 2000s 76.2% (163/214).


(12) […] the Salvation Army band could be heard still playing cheerfully on and on. (COHA; Fiction; 1953)

(13) Basically, Brendan’s style is to play that Missy Elliott album again and again until I want to kill him. (COHA; Magazine; 2003)

(14) Sometimes even horrible memories play over and over in my mind, […]. (COHA; Fiction; 2003)

In conclusion, on and on has developed two new kinds of meaning: durative and textual. In addition, it has gone through subjectification, expressing mainly negative attitudes on the part of the speaker.

3.5 (A)round and (a)round

As mentioned in the introduction, the adverb forms around and round are variants in more or less free variation. The same seems to be the case for the sequences round and round and around and around, which will therefore be treated together here. According to many sources, around is generally preferred in American English and round in British English (cf., e.g., OED s.v. round adv. and prep.). However, Baker (2011:77−80) has shown that in British English as represented by the Brown family of corpora, the total frequency of around overtook round between 1961 and 1991, and then increased very fast between 1991 and 2006. It is noteworthy, however, that the sequence round and round is still much more frequent than around and around in British English.

The preference in American English for around and around over round and round is also quite recent: round and round was the most frequent form in COHA all the way up to the 1950s. It is only in the 2000s that around and around has prevailed in the American corpus. This small current preference for around and around should be compared with the overall frequencies of the single adverbs around and round in COHA, where around is 25 times more frequent than round in the 2000s. Adverbial round has thus been preserved to a greater extent in the ADV1 and ADV1 construction than in the language as a whole both in British and American English, which suggests that the older form of the adverb has been “fossilized” inside the construction in the same way as obsolete words may live on in formulaic sequences, as mentioned by Wray (2002:49).

One of the meanings given by the OED for round is (s.v. round A.I.3.b.): “In a circular or orbital course; so as to make a complete circuit. Freq. reduplicated to indicate repetition, as round and round.” This sense is illustrated in (15).

(15) Jack could hear them muttering as they turned the strange black objects around and around in their hands. (COHA; Fiction; 2004)

Our findings thus show that although the simple adverb around is increasing rapidly at the expense of round, this process is considerably slower in the sequence (a)round and (a)round. The reason for such fossilization, Wray (2002:130) argues, is that native speaker children engage in Needs Only Analysis in their acquisition process, meaning that they analyse and segment linguistic material only if there is a good reason for it (rather than segmenting as much material as possible).

3.6 Up and up

Up and up is most frequently used with non-figurative meaning, expressing continuous movement upwards, as in (16).

(16) The baby turtle uses her flippers to climb up and up. (COCA; Magazine; 2010)

As noted in the OED (s.v. up n. 4a and 4b), up and up has also developed two metaphorical meanings where up has been converted into a noun, as in (17) and (18). Of the 28 instances of up and up in the 2000s in COHA, three occurred in this idiomatic meaning.

(17) Are you confident that the Colombian officials you dealt with are on the up-and-up and you can trust them? (COCA; Spoken; 1990)

(18) Each article or book chapter counts for one point; each book counts for ten; and the productivity trend, needless to say, is on the up and up. (BNC; Non-Fiction; 1990)

In (17), on the up-and-up means ‘honest’, while in (18) it means ‘rising’. In our present-day American material, only the ‘honest’ meaning occurs, while both meanings can be found in the BNC. Hyphenation is variable.

3.7 Through and through

In its treatment of through and through, the OED gives the following explanation (s.v. through prep. and adv. A.1.h.): “repeatedly through; so as to penetrate both sides or surfaces of; right through, entirely through. Also fig. (Cf. B. 5.)”. It goes on: “With repeated or complete penetration; through the whole thickness or substance; completely from beginning to end; right through, entirely through” (ibid. B.5.a.). And further: “In all points or respects; thoroughly, wholly, entirely, out and out” (ibid. B.5.b.).

With the sequence through and through there has been a shift from verbs expressing physical motion or penetration (PIERCE, SHOOT, PENETRATE), as in (19), to the BE ADJ/NP through and through construction expressing either positive or negative attitude, as shown in (20). In the COHA material there is a significant shift (p < 0.05) from physical motion use to abstract extension between the 1850s (93% (37/40) motion verbs) and the 1900s (34% (20/59) motion verbs).

(19) I’ll mount my horse and ride both day and night Till I find out the brute, and then this sword Shall stab him through and through. (COHA; Fiction; 1815)

(20) He’s poisoned the child with his crazy talk. Anti-American through and through. (COHA; Fiction; 1958)

Syntactically, through and through can postmodify either an adjective, as in (20), or an NP, as in (21).

(21) […] that guy is not only rich and spoiled, he’s a bad apple through and through. (COHA; Fiction; 2003)

In Present-day English this sequence typically expresses opinions about another person’s character. It thus instantiates a shift from concrete to abstract meaning combined with speaker attitude towards the degree or extent of a particular phenomenon.

3.8 By and by

By and by is the only sequence in our material that occurs only in a figurative sense, described by the OED (s.v. by and by adv. and noun A. adv. 4) as: “Before long, presently, soon, shortly.” The original, now obsolete meaning was, however, the non-temporal “Of a succession of (persons or things): One by one, one after another, in order” (OED s.v. by and by adv. and noun A. adv. 1). An example of the current temporal meaning is given in (22).

(22) We’ll understand it all by and by. (COCA; Magazines; 2011)

As noted in the OED, there is also a nominalized form of by and by, ‘time coming’, now mainly used in the form the sweet by and by17, as in (23).

(23) No, you know − God will take care of the sweet by and by. (COCA; Spoken; 2006)

As mentioned in Section 3.1.2, by and by has almost totally gone out of use.

3.9 Out and out

Out and out relatively rarely expresses continued concrete or abstract outward movement, as in (24). More frequently, the sequence carries the figurative meanings ‘thoroughly, completely, entirely; downright’ as an adverb, and ‘complete, thoroughgoing, unqualified’ as an adjective (OED s.v. out and out). These figurative uses are illustrated in (25) and (26).

(24) Out and out and out the blast spreads. (COCA; Magazines; 2004)

(25) And sometimes parents just out and out doubt the teacher’s veracity. (COCA; Magazines; 2003)

(26) That is all an out and out lie and apparently his testimony has not been shaken. (COCA; Spoken; 1996)

In the figurative sense, out and out is normally used to express the speaker’s negative attitude to the action or person etc. described.

17 From the Christian hymn “The Sweet By-and-By” with lyrics by S. Fillmore Bennett and music by Joseph P. Webster.

3.10 In and in

The sequence in and in is occasionally used to describe continued concrete inward motion, as in (27). This use was only found in the American material.

(27) But it’s very slow-going, as we just saw, as this storm is closing in and in. (COCA; Spoken; 2005)

As noted by the OED, however, in and in has also developed a specialized meaning in the phrases to breed in and in “to breed always within a limited stock” and to marry in and in, “to marry with near relatives, in successive generations” (OED s.v. in and in). An example of a concrete use of this specialized meaning is given in (28).

(28) If a farmer persists in what is called “breeding in and in,” that is, from the same stock without changing the blood, it is well known that a rapid degeneracy is the inevitable consequence. (COHA; Non-Fiction; 1853)

In and in is thus another example of layering where the intensified directional meaning of the sequence coexists with a metaphorical one.

4. Discussion

4.1 Semantic-pragmatic processes

The formulaic sequences in the present study are undergoing a number of semantic-pragmatic processes, which will be dealt with here under the headings metaphorization; subjectification; development of textual or discourse functions; and grammaticalization and lexicalization. Finally, iconicity as an explanatory factor will be discussed.

4.1.1 Metaphorization

Metaphorization is reflected in the tendency for the meanings of these sequences to move from expressing concrete meanings to expressing more abstract ones. This is seen most clearly in our corpus material with on and on and over and over, but is also evidenced in the OED for by and by and through and through. On and on, which in the mid-19th century mainly expressed a physical forward motion, in Present-day English mostly refers to the abstract extension of events. Similarly, over and over has shifted from being associated with physical motion described by verbs like TURN and ROLL to being connected with any repeated activity. On and on and over and over have thus moved from referring to movement in space to referring to continuous and repeated activity, respectively. By and by and through and through have also followed a path towards more abstract meaning. By and by originally referred to a succession of people or things, and then extended to a succession of events. Through and through referred to physical extension (like stab him through and through in (19) above) from the outset, but is now often used with abstract extension. The material thus provides typical examples of how conceptual metaphor brings about semantic change (Traugott & Dasher 2005:27–34). On and on, over and over and through and through in some instances still retain their original concrete sense in parallel with their new metaphorical senses. These three formulaic sequences are therefore, like in and in as argued in 3.10, instances of layering of meaning.

When phrases undergo metaphorization, they also tend to be affected by subjectification, as will be seen in the next section.

4.1.2 Subjectification

Subjectification (Traugott 1995, 2010; Traugott & Dasher 2005) is a tendency for meanings to increasingly become explicitly grounded in the speaker’s or writer’s perspective (Traugott & Dasher 2005:6). Traugott (2010:56–60) considers some of the ways in which subjectification has been operationalized in previous studies, concluding that it seems unlikely that there are defining criteria of subjectification that can be applied across different construction types and languages. As seen above in section 3.4, the expression of negative connotations is the clearest type of subjectification found in the present investigation. In previous studies, an increased use of first-person subjects in sentences with the item that is becoming subjectified has often been taken as a clue to subjectification, but Traugott suggests that such subjects are not necessarily linked to increasing subjectivity. Instead she argues that a shift towards negative evaluation is connected to an increase in third-person subjects. Our material provides some support for this in that there has been a slight, non-significant decrease in the use of first-person subjects with the increasingly negative and subjective on and on (dropping from 10% (14/136) in the 1900s to 6% (12/214) in the 2000s). When first-person subjects occur with GO on and on it is sometimes in connection with apologies for misbehaviour, as in (29):

(29) I’m sorry to have gone on and on about these things. (COHA; Fiction; 2007)

Another criterion of subjectification mentioned by Traugott is more clearly illustrated with on and on. Traugott (2010:59–60) argues that subjectification often correlates with increasingly peripheral positions in the clause. In the COHA material there is a significant shift (p < 0.05) between the 1900s and 2000s towards clause- or sentence-final position of on and on (1900s 46% (63/136); 1950s 54% (90/166); 2000s 61% (130/214)).
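The two contrasts between the 1900s and the 2000s reported in this section (first-person subjects, and clause- or sentence-final position) can be compared with an exact test. The sketch below uses a two-sided Fisher exact test as an assumption, since the paper does not say which test was applied; the function name `fisher_exact_two_sided` is hypothetical and only the counts come from the text.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalised hypergeometric weight of a table whose top-left cell is x.
        return comb(row1, x) * comb(row2, col1 - x)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    observed = weight(a)
    # Sum the weights of all tables that are no more likely than the observed one.
    extreme = sum(weight(x) for x in range(lowest, highest + 1)
                  if weight(x) <= observed)
    return extreme / comb(row1 + row2, col1)

# First-person subjects with on and on: 14/136 (1900s) vs. 12/214 (2000s).
p_first_person = fisher_exact_two_sided(14, 136 - 14, 12, 214 - 12)
# Clause- or sentence-final on and on: 63/136 (1900s) vs. 130/214 (2000s).
p_final = fisher_exact_two_sided(63, 136 - 63, 130, 214 - 130)

print(f"first-person subjects: p = {p_first_person:.3f}")
print(f"final position:        p = {p_final:.3f}")
```

Because the weights are compared as exact integers, ties with the observed table are handled without floating-point tolerance. Under this test the first contrast stays above 0.05 and the second falls below it, in line with the description of the former as non-significant and the latter as significant.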

Subjectification has previously mainly been connected with grammaticalization, but as has been seen in the present study this type of change is also relevant in the lexicalization of formulaic sequences. This is supported by Wray’s (2002:95–100) suggestion that the use of formulaic sequences serves the goal of promoting the speaker’s interests. For example, formulaic sequences aid the speaker’s production by organizing text and can be used to influence the hearer and to manipulate information. The different sequences in this study fulfil a number of subjective functions, as for instance the general extender and on and on (see below) which is used both to structure text and to express attitude towards the items listed, and through and through which expresses a subjective attitude towards the extent of something. The main type of subjectification observed with these formulaic sequences is the association with negative subjective attitudes.

4.1.3 Development of textual or discourse functions

As already mentioned, on and on has developed a specialized text-organizing function as a general extender. This occurs when a verbless and on and on follows on, for instance, noun phrases, as in (30), or clauses or sentences, as in (31).

(30) All the great innovators are pictured there. I mean, you’ve got Dizzy Gillespie and Thelonius Monk and Mary Lou Williams and Marian McPartland and Gerry Mulligan and Gene Krupa and Count Basie and on and on. (COCA; Spoken; 1995)

(31) But anyhow, the Emir runs the country. A huge part of the oil production goes into his family’s pocket. He has 70 wives, and on and on and on and on and on. (COCA; Spoken; 1991)

General extenders “generalize from a preceding referent to the larger group of items to which that referent belongs” (Tagliamonte & Denis 2010:335). This is seen in (30) and (31), where (31) is typical of the and on and on instances in COCA in that it expresses negative attitude towards the items in the list, while (30) belongs to the rarer positive instances.18

General extenders (sometimes referred to as vague category identifiers (Channell 1994), coordinating tags (Biber et al. 1999) or referent-final tags (Aijmer 2002)), such as and stuff like that and or something, have received extensive coverage in recent years (e.g., Channell 1994:119–156; Biber et al. 1999:115–117; Overstreet 2000; Aijmer 2002:211–249; Tagliamonte & Denis 2010). However, it appears that and on and on has been largely overlooked. General extenders do not only serve as extenders of lists from which listeners can infer other members, but also as interpersonal expressions that express speaker attitude towards the message and the hearer (Overstreet 2000; Aijmer 2002:211–249). The general extender and on and on thus performs two of the more frequent functions that formulaic sequences typically fulfil (see, e.g., Wray 2002:52–55; Granger & Paquot 2008:41–45), since it not only organizes text, but, as seen in (30) and (31), also allows the expression of the speaker’s subjective attitude towards their utterances. As with other general extenders, and on and on can serve several interpersonal functions. For example, in the examples above, longer lists would flout the Gricean maxim of quantity that speakers should not say more than is required (as also noted by Aijmer 2002:232). Furthermore, speakers and writers negotiate common ground with their intended recipients. For instance, in (31) the hearers are expected to share the speaker’s presuppositions about other disagreeable characteristics of an autocratic ruler. In conversation, vagueness and imprecision about values and opinions rely on speakers sharing knowledge and experience. Interpersonal involvement rather than explicitness is important. In academic writing, on the other hand, writers can use a partial list and add a general extender as a hedging device (Biber et al. 1999:1045). The development of and on and on into a multi-functional general extender is thus a prime example of subjectification of meaning.

The general extender and on and on occurs in all the genres studied, as seen in Table 2.

18 A function similar to that of a general extender is performed by the sequence the list GO on and on, as in From Barry Bonds to Floyd Landis, the list goes on and on of athletes who are willing to sell their bodies and their souls for the sake of winning (…) (COHA; News; 2006). There were 14 instances of the phrase the list goes on and on in COHA in the 2000s. The first attestation of this sequence in COHA is from 1952.

Table 2. And on and on (‘and so on’) in COCA

Genre       Tokens   Tokens/M   Negative evaluation
Spoken      96       1.1        68% (65)
Fiction     68       0.8        54% (37)
Magazine    66       0.8        58% (38)
Newspaper   59       0.7        59% (35)
Academic    33       0.4        72% (24)

As can be seen in Table 2, the expression of negative subjective attitudes towards the items listed is one of the main functions of and on and on. There is a slight tendency for and on and on to be more frequent in informal genres, where subjective attitudes are more often expressed, rather than in the more neutral style of academic writing, where the sequence is also the least frequent. However, there were no significant differences between the genres as regards the expression of negative subjective attitudes.

And on and on does not appear to be in real competition with the more established and so on, however, since the latter outnumbers the former by 25 to 1 in COCA. Moreover, while and on and on expresses negative connotations in a majority of the cases (62% (199/322)), and so on is more neutral with only 19 negative out of 100 random examples (this difference is significant (p < 0.05)). There is thus a division of labour between the two expressions.
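The pooled figure of 62% (199/322) follows directly from the genre rows of Table 2. The sketch below simply re-adds those rows; the variable names are illustrative, and the per-genre shares it prints are recomputed from the counts, not copied from the table.

```python
# (tokens, negative tokens) per COCA genre, copied from Table 2.
table2 = {
    "Spoken": (96, 65),
    "Fiction": (68, 37),
    "Magazine": (66, 38),
    "Newspaper": (59, 35),
    "Academic": (33, 24),
}

total_tokens = sum(tokens for tokens, _ in table2.values())
total_negative = sum(neg for _, neg in table2.values())
pooled_share = 100 * total_negative / total_tokens

for genre, (tokens, neg) in table2.items():
    print(f"{genre:<10} {neg:>3}/{tokens:<3} = {100 * neg / tokens:.1f}% negative")
print(f"{'Pooled':<10} {total_negative}/{total_tokens} = {pooled_share:.1f}% negative")
```

Recomputed shares may differ from the printed table in the last digit, depending on the rounding convention used.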

4.1.4 Grammaticalization and lexicalization

There was only one clear instance of grammaticalization in the material, and while it is not from Standard English, it is still noteworthy. The grammaticalization of by and by as a marker of future time in Tok Pisin is a textbook example of a lexical item acquiring a grammatical function. By and by developed into baimbai and finally into bai (Sankoff & Laberge 1974:77–79; Verhaar 1995:314–315), as in Nil kapa bai i no ros kwik ‘Copper nails will not rust easily’ (Verhaar 1995:315). The peak in frequency of by and by in our American material coincides with the British colonization of Papua New Guinea in the 1880s. It is therefore likely that it was a fairly frequent adverb also in BrE speech at the time, which would explain why this adverb, which is quite rare in Present-day English, could become grammaticalized in Tok Pisin.

More ADV1 and ADV1 sequences have undergone and are undergoing lexicalization. As discussed by Brinton & Traugott (2005:37–38), it has been argued that conversion, i.e. the shift from one category to another, is a type of lexicalization. Lexicalization is probably most evident in those cases where a formulaic sequence changes word class and acquires a new idiosyncratic meaning. Among the ADV1 and ADV1 sequences, this is seen with the noun (on the) up-and-up and the adjectival out-and-out (hoodlums). Such conversion has been attested with other sequences as well, as for instance when the adverb face to face is used as a premodifier (a face-to-face meeting) and a noun (let’s have a face-to-face) (for more examples, see Lindquist & Levin 2009). Hyphenation in writing is frequent in adjectival uses and even more so in nominal ones.

However, other instances of lexicalization typically meet a number of other criteria mentioned by Brinton & Traugott (2005). Of these criteria we will discuss univerbation, non-compositionality, grammatical deficiency, lack of substitutability and simplification.

Lexicalization is frequently thought of as univerbation (Brinton & Traugott 2005:47–52). Signs of incipient univerbation can be detected in hyphenated forms, e.g., by-and-by, out-and-out used as an adjective and up-and-up as a noun. As will be seen below in the discussion of iconic variation, such lexicalized forms resist multiplication (*an out-and-out-and-out madman). The forms used as adjectives or nouns are all semantically non-compositional (ibid. p. 55), i.e. it is not possible to deduce the meaning from the parts of the sequence. These criteria are interrelated: Non-compositional sequences become stored as single units in the mental lexicon (cf. Wray 2002:128−139) and are often written with hyphens as single units. Because the meanings of the sequences are opaque, there can be no iconic variation. Non-compositionality is probably the most pervasive feature of the ADV1 and ADV1 construction, since all types except down and down have developed opaque uses. This applies both to more frequent sequences such as on and on and the rare and technical BREED/MARRY in and in, which shows that non-transparent meanings do not only develop with high-frequency items.

In the material there are two examples of emerging non-compositionality. On and on and over and over are losing their association with motion verbs, and are instead increasingly used metaphorically. While over and over co-occurs with a number of verbs denoting, for instance, communication (SAY, TELL), on and on is increasingly becoming restricted to the idiom GO on and on. Such fixing of collocates is typical of lexicalization, according to Brinton & Traugott (2005:105). GO on and on is thus also an example of formulaic sequences showing an increasing lack of substitutability (Brinton & Traugott 2005:55). There is not, and never has been, any lexical substitutability in the ADV1 and ADV1 construction itself, in that and cannot be replaced by the synonymous as well as: *she prattled on as well as on. A factor influencing the low degree of variability in formulaic sequences, adduced by Hudson (1998:33–36) (also discussed by Wray 2008:16), is that many such sequences are adverbs, and adverbs are often not subject to variation either as sequences of words (by and large) or as individual words (mostly).

It can be argued that the next criterion, grammatical deficiency (Brinton & Traugott 2005:55), is reflected in the fact that coordination of two identical adverbs by means of and can be seen as irregular. This holds especially for sequences that are used as adjectives or nouns. The other criteria suggested are either not applicable (e.g., lack of passivization) or not fulfilled (lack of topicalization; on and on it went is unproblematic).

Simplification (Brinton & Traugott 2005:54) is illustrated in the ADV1 and ADV1 material by the loss of again after over and over. Expressions that are used repeatedly tend to be reduced.19 Economy and clarity are two fundamental motivations for speakers. In the case of the slightly tautological over and over again, it can be argued that the increasingly non-transparent over and over can serve the same function as the longer sequence, and that therefore the reduction of this phrase affects the clarity for hearers only minimally.

It is noteworthy that the reduction of over and over (again) was the fastest between the 1850s and the 1900s, which is the time span when the frequency of over and over increased the most (see Figures 1 and 3 above). Since then the overall frequency of the sequence has been decreasing and the reduction process has been slowing down. It has been shown in numerous studies that frequency plays a role in simplification (cf., e.g., Bybee 2006), and in this case it seems that over and over (again) went through a dynamic period of change that coincided with its rapid increase in frequency. As the sequence was more frequently used, speakers could simplify the form because hearers could be expected to quickly understand a frequent sequence.

4.1.5 Iconicity

The origin of the ADV1 and ADV1 construction most likely lies in iconicity: the adverb is repeated in order to express intensity. Such reduplications are common in the world’s languages and their development can be studied in pidgins and creoles. Reduplications in Tok Pisin, e.g., “convey feelings in regard to the high degree in which the modifier obtains, or to the large quantity of what is modified by the reduplicated form, or both” (Verhaar 1995:292). Thus, for instance, bikpela bikpela means ‘in large quantities, huge, very large’.

19 Reduction of lexicalizing items is mainly exemplified by phonologically reduced forms (e.g., lord < OE hlaf ‘loaf’ + weard ‘guardian’; Brinton & Traugott 2005:47–52), but in the case of over and over again, a word is elided.

Occasionally, the intensification is carried further and sequences of three or more identical adverbs occur in the material, as can be seen for some of the sequences in Table 3.

Table 3. Frequencies of ‘multiple’ tokens with three adverbs in COHA20

Sequence                     1850s        1900s          1950s           2000s
again and again and again    0% (0/207)   0.2% (1/424)   1.9% (6/322)    7.3% (27/369)
on and on and on             0% (0/19)    3.7% (5/136)   6.6% (11/166)   8.9% (19/214)
over and over and over       0% (0/138)   0.6% (2/317)   2.7% (9/330)    4.1% (21/518)
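The percentages in Table 3 follow directly from the raw counts given in parentheses. As a minimal sketch (written in Python purely for illustration; the names `TABLE_3`, `percentage` and `recompute` are hypothetical and not part of the study), the proportions can be recomputed from those counts:

```python
# Raw counts copied from Table 3: (multiple tokens, total as given in the table)
# for each decade in COHA.
TABLE_3 = {
    "again and again and again": {
        "1850s": (0, 207), "1900s": (1, 424), "1950s": (6, 322), "2000s": (27, 369),
    },
    "on and on and on": {
        "1850s": (0, 19), "1900s": (5, 136), "1950s": (11, 166), "2000s": (19, 214),
    },
    "over and over and over": {
        "1850s": (0, 138), "1900s": (2, 317), "1950s": (9, 330), "2000s": (21, 518),
    },
}


def percentage(hits: int, total: int) -> float:
    """Return hits/total as a percentage rounded to one decimal place."""
    return round(100 * hits / total, 1)


def recompute(table):
    """Map every (hits, total) pair in the table to its percentage."""
    return {
        sequence: {decade: percentage(h, t) for decade, (h, t) in row.items()}
        for sequence, row in table.items()
    }


if __name__ == "__main__":
    for sequence, row in recompute(TABLE_3).items():
        print(sequence, row)
```

Keeping the counts and the derived percentages separate makes it easy to check the table for rounding slips or to add further decades.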

It is a reasonable assumption that the increase in multiple tokens in this written corpus is an effect of colloquialization of writing styles (Mair 2006:185–187) rather than the emergence of an entirely new way of using these sequences. Informal, more spontaneously produced language where subjective attitudes are more likely to be expressed contains more such tokens. In COCA, the highest ratios for triplicated tokens are consistently found in the Spoken sub-corpus, and the lowest in the Academic sub-corpus. Moreover, in LSAC the multiple forms on and on and on and over and over and over are significantly (p < 0.05) more frequent than in the written material from COHA, 2000s.21
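The text does not state which significance test was used. As a hedged illustration only, the sketch below applies a Pearson chi-square test (one degree of freedom, no continuity correction) to 2×2 tables built from the LSAC counts in footnote 21 and the COHA 2000s counts in Table 3. The function name `chi_square_2x2` is hypothetical, and the choice of test is an assumption rather than the authors' documented method.

```python
import math


def chi_square_2x2(hits_a, total_a, hits_b, total_b):
    """Pearson chi-square (df = 1, no continuity correction) for two proportions.

    Returns (chi2, p). For one degree of freedom the upper-tail
    probability of the chi-square distribution is erfc(sqrt(chi2 / 2)).
    """
    a, b = hits_a, total_a - hits_a
    c, d = hits_b, total_b - hits_b
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero; the test is undefined")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# (multiple tokens, total): LSAC from footnote 21, COHA 2000s from Table 3.
COMPARISONS = {
    "on and on and on": ((17, 39), (19, 214)),
    "over and over and over": ((17, 49), (21, 518)),
}

if __name__ == "__main__":
    for sequence, ((h1, t1), (h2, t2)) in COMPARISONS.items():
        chi2, p = chi_square_2x2(h1, t1, h2, t2)
        print(f"{sequence}: chi2 = {chi2:.2f}, significant at 0.05: {p < 0.05}")
```

With cell counts as small as some of these, an exact test may be preferable; the chi-square version is shown only because it needs nothing beyond the standard library.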

Judging from COCA and the BNC, there is a slightly stronger tendency in American English than in British English to use the ‘multiple adverbs’ again and again and again and over and over and over, while the proportions are about equal for on and on and on. Examples are given in (32)–(34).

(32) He slapped himself on the top of his head, again and again and again. (COHA; Fiction; 2000)

(33) The hypnotic, rhythmic pounding of Roosevelt’s garbled voice droned on and on and on. (COHA; Fiction; 2005)

(34) Every little thing that’s wrong is pointed out over and over and over. (COHA; Magazine; 2006)

20 For all three sequences there was one relevant ‘quadruple’ instance found. The increase for again and again was significant (p < 0.05) between the 1950s and 2000s.

21 In LSAC, the frequencies of multiple tokens were as follows: again and again 33% (2/6), on and on 44% (17/39) and over and over 35% (17/49).

More idiomatic/non-transparent cases resist triplication. While there are examples of up and up and up and out and out and out in a concrete sense, as in go up and up and up or spread out and out and out, as in (24) above, there are none in the idiomatic senses: *on the up and up and up, *an out-and-out-and-out madman.

We will now turn to some general conclusions.

5. Conclusions

In this section we will summarize the main findings of the study, answer the questions asked in the introduction, make some general comments about methodology and suggest avenues for further research.

The study has shown that in all cases except again and again and by and by, the original concrete, physical meaning is still in use together with the more recent abstract and metaphorical meanings, constituting a typical case of layering. The only case of grammaticalization found was in fact in another language (by and by > bai in Tok Pisin) but there were many sequences that to various degrees fulfil a number of the criteria for lexicalization: univerbation, non-compositionality, grammatical deficiency, lack of substitutability and simplification.

The ADV1 and ADV1 sequences behave differently as far as frequency is concerned. A majority peak before the 2000s: by and by and round and round in the 1850s, again and again and through and through in the 1900s and out and out and up and up in the 1950s. There are specific reasons for the increases of the remaining four: around and around and over and over are taking over some or most uses of round and round and again and again (but may themselves be affected by competition from other adverbials in the future). On and on continues its slow increase while narrowing its scope of use to the sequence GO on and on. Down and down, finally, has not developed any non-transparent meanings, is restricted to fiction and remains at quite a low frequency level.

A complex picture emerges as regards genre variation and change. The three most frequent sequences (again and again, on and on, over and over) are more common in fiction than in other genres in COHA,22 and fiction appears to be leading the way in the increase of on and on and over and over. Furthermore, fiction is the most advanced genre when it comes to the loss of again after over and over. The simplification of this sequence is thus the most advanced in the genre with the highest token frequency. The loss of again can therefore be shown to be frequency dependent. Further studies are required to analyze why fiction writers display a preference for these sequences and why again is retained to such a high degree after over and over in speech. A more expected trend is that multiple tokens (on and on and on) are more frequent in speech than in writing. Such multiple tokens are iconic in nature, intensifying the meaning of the dual sequences, just as the dual sequences intensify the meaning of the single adverbs.

The study thus supports the argument that “ADV1 and ADV1” constitutes a construction or pattern lying behind a set of sequences which share a number of characteristics: a common grammatical form which is fixed but can be multiplied (when the sequence is used in its basic concrete sense) and in almost all cases a propensity for developing metaphorical, abstract meanings along typical paths of change. Some sequences have also developed textual functions and/or subjective meanings. Furthermore, a few have adopted new grammatical roles as nouns or adjectival pre-modifiers of nouns. It is thus reasonable to claim that these sequences make up a family of expressions, each bearing a family resemblance to the others.

Methodologically, the study has shown that for this kind of lexical investigation, very large corpora are needed. With the exception of the BNC, however, such large corpora are rarely fully comparable across time periods or regional varieties, which means that triangulation with a number of different corpora and careful evaluation of the results are always necessary.

In further research, it would be interesting to investigate the development of alternative ways of expressing the meanings expressed by the ADV1 and ADV1 sequences, and to look more closely at the mechanisms behind some of the more spectacular changes (the demise of by and by; the decline of round and round). The role of genre or register in those changes also needs further investigation.

22 This generally holds true also in comparisons with spoken data from COCA and LSAC. The only exception is on and on, which is marginally more frequent in COCA Spoken than in COHA Fiction.

References

Aijmer, K. 2002. English Discourse Particles: Evidence From a Corpus. Amsterdam: Benjamins.

Altenberg, B. 1998. “On the phraseology of spoken English: The evidence of recurrent word-combinations”. In A. P. Cowie (Ed.), Phraseology. Oxford: Oxford University Press, 101–122.

Aston, G. & L. Burnard. 1998. The BNC Handbook. Edinburgh: Edinburgh University Press.

Baker, P. 2011. “Times may change, but we will always have money: Diachronic variation in recent British English”. Journal of English Linguistics 39 (1), 65–88.

Barlow, M. 2011. “Corpus linguistics and theoretical linguistics”. International Journal of Corpus Linguistics 16 (1), 3–44.

Biber, D. 2009. “A corpus-driven approach to formulaic language in English. Multi-word patterns in speech and writing”. International Journal of Corpus Linguistics 14 (3), 275–311.

Biber, D., S. Johansson, G. Leech, S. Conrad & E. Finegan. 1999. Longman Grammar of Spoken and Written English. Harlow: Longman.

Brinton, L. J. & E. C. Traugott. 2005. Lexicalization and Language Change. Cambridge: Cambridge University Press.

Bybee, J. L. 2006. “From usage to grammar: The mind’s response to repetition”. Language 82 (4), 711–733.

Channell, J. 1994. Vague Language. Oxford: Oxford University Press.

Davies, M. 2008–. The Corpus of Contemporary American English (COCA): 425 million words, 1990–present. Available online at http://www.americancorpus.org.

Davies, M. 2009. “The 385+ million word Corpus of Contemporary American English (1990–2008+). Design, architecture, and linguistic insights”. International Journal of Corpus Linguistics 14 (2), 159–190.

Davies, M. 2010. “The Corpus of Contemporary American English as the first reliable monitor corpus of English”. Literary and Linguistic Computing 25 (4), 447–464.

Davies, M. 2010–. The Corpus of Historical American English (COHA): 400+ million words, 1810–2009. Available online at http://corpus.byu.edu/coha.

Davies, M. 2011–. Google Books (American English) Corpus (155 billion words, 1810–2009). Available online at http://googlebooks.byu.edu/.

Ellis, N. C. 2008. “Phraseology: The periphery and the heart of language”. In F. Meunier & S. Granger (Eds.), Phraseology in Foreign Language Learning and Teaching. Amsterdam: Benjamins, 1–13.

Granger, S. & F. Meunier. 2008. Phraseology. An Interdisciplinary Perspective. Amsterdam: Benjamins.

Granger, S. & M. Paquot. 2008. “Disentangling the phraseological web”. In S. Granger & F. Meunier (Eds.), Phraseology. An Interdisciplinary Perspective. Amsterdam: Benjamins, 27–49.

Gries, St. Th. 2008. “Phraseology and linguistic theory: A brief survey”. In S. Granger & F. Meunier (Eds.), Phraseology. An Interdisciplinary Perspective. Amsterdam: Benjamins, 3–25.

Hudson, J. 1998. Perspectives on Fixedness: Applied and Theoretical. Lund: Lund University Press.

Jackendoff, R. 2008. “Construction after construction and its theoretical challenges”. Language 84 (1), 8–28.

Leech, G., M. Hundt, C. Mair & N. Smith. 2009. Change in Contemporary English. A Grammatical Study. Cambridge: Cambridge University Press.

Lindquist, H. & M. Levin. 2009. “The grammatical properties of recurrent phrases with body part nouns: The N1 to N1 pattern”. In U. Römer & R. Schulze (Eds.), Exploring the Lexis-Grammar Interface. Amsterdam: Benjamins, 171–188.

Mair, C. 2006. Twentieth-Century English. History, Variation and Standardization. Cambridge: Cambridge University Press.

Moon, R. 1998. Fixed Expressions and Idioms in English. A Corpus-Based Approach. Oxford: Oxford University Press.

Overstreet, M. 2000. Whales, Candlelight, and Stuff Like That: General Extenders in English Discourse. Oxford: Oxford University Press.

Oxford English Dictionary (OED) on line. Available at http://www.oed.com

Pawley, A. & F. H. Syder. 1983. “Two puzzles for linguistic theory”. In J. C. Richards & R. W. Schmidt (Eds.), Language and Communication. London: Longman, 191–226.

Sankoff, G. & S. Laberge. 1974. “On the acquisition of native speakers by a language”. In D. DeCamp & I. Hancock (Eds.), Pidgins and Creoles. Georgetown: Georgetown University Press, 73–84.

Sinclair, J. 1998. “The lexical item”. In E. Weigand (Ed.), Contrastive Lexical Semantics. Amsterdam: Benjamins, 1–24.

Sinclair, J. 1999. “A way with common words”. In H. Hasselgård & S. Oksefjell (Eds.), Out of Corpora. Amsterdam: Rodopi, 157–79.

Stubbs, M. 2002. “Two quantitative methods of studying phraseology in English”. International Journal of Corpus Linguistics 7 (2), 215–44.

Stubbs, M. 2007a. “On very frequent phraseology in English: Structures, distributions and functions”. In R. Facchinetti (Ed.), Corpus Linguistics Twenty-five Years on. Amsterdam: Rodopi, 89–105.

Stubbs, M. 2007b. “Quantitative data on multi-word sequences in English: The case of the word ‘world’”. In M. Hoey, M. Mahlberg, M. Stubbs & W. Teubert, Text, Discourse and Corpora. London: Continuum, 163–189.

Tagliamonte, S. & D. Denis. 2010. “The stuff of change: General extenders in Toronto, Canada”. Journal of English Linguistics 38 (4), 335–368.

Traugott, E. C. 1995. “Subjectification in grammaticalization”. In D. Stein & S. Wright (Eds.), Subjectivity and Subjectivisation. Cambridge: Cambridge University Press, 31–54.

Traugott, E. C. 2010. “(Inter)subjectivity and (inter)subjectification: A reassessment”. In K. Davidse, L. Vandelanotte & H. Cuyckens (Eds.), Subjectification, Intersubjectification and Grammaticalization. Berlin: Mouton, 29–71.

Traugott, E. C. & R. B. Dasher. 2005. Regularity in Semantic Change. Cambridge: Cambridge University Press.

Verhaar, J. W. M. 1995. Toward a Reference Grammar of Tok Pisin: An Experiment in Corpus Linguistics. Honolulu: University of Hawai’i Press.

Wray, A. 2002. Formulaic Language and the Lexicon. Cambridge: Cambridge University Press.

Wray, A. 2008. Formulaic Language: Pushing the Boundaries. Oxford: Oxford University Press.