skinner's verbal behavior, chomsky's review and mentalism

JOURNAL OF THE EXPERIMENTAL ANALYSIS OF BEHAVIOR

SKINNER'S VERBAL BEHAVIOR,CHOMSKY'S REVIEW, AND MENTALISM

NATHAN STEMMERBAR-ILAN UNIVERSITY, ISRAEL

Skinner's Verbal Behavior (1957) is a comprehensive treatise that deals with most aspects of verbalbehavior. However, its treatment of the learning of grammatical behavior has been challenged re-peatedly (e.g., Chomsky, 1959). The present paper will attempt to show that the learning of grammarand syntax can be dealt with adequately within a behavior-analytic framework. There is no need toadopt mentalist (or cognitivist) positions or to add mentalist elements to behaviorist theories.

Key words: verbal behavior, grammar, B. F. Skinner, N. Chomsky, behaviorism, mentalism, cog-nitivism

Although Skinner's Verbal Behavior (1957)is a comprehensive treatise that deals with mostaspects of verbal behavior, Chomsky's (1959)review of the book was extremely critical; hesuggested that the book demonstrated the fail-ure of behaviorism. The review was very in-fluential, and many psychologists accepted itsconclusions. The effect was an abandonmentof behaviorist principles and the ready accep-tance of mentalist (or cognitivist) approachesamong most of the intellectual community.About 10 years later, MacCorquodale

(1970) published a reply to Chomsky in whichhe answered most of Chomsky's arguments,but the effects of this reply were minimal. Whyhad MacCorquodale's paper so little impact?There are probably various reasons (see, e.g.,Czubaroff, 1988), but one is apparently cru-cial. Although MacCorquodale showed that alarge number of Chomsky's objections are un-justified, his answer to the main objection,namely Chomsky's critique of Skinner's ac-count of the learning of grammatical behavior,was unsatisfactory.

In this paper I will argue that Chomsky'sobjection is invalid. A Skinnerian frameworkcan indeed account for grammatical behavior,and there is no need to adopt mentalist ap-proaches or to add mentalist elements to be-haviorist principles (cf. Killeen, 1984). Beforedealing with grammatical behavior, however,

I am grateful to two anonymous referees of this journalfor many helpful comments on an earlier version of thispaper. Address for correspondence: Nathan Stemmer, De-partment of Philosophy, Bar-Ilan University, Ramat-Gan,Israel.

it is important to examine some basic aspectsof the learning of words.

THE LEARNING OF WORDSAccording to Skinner (1957), children learn

to say words such as red (pp. 84-85), chair(pp. 91-92), pyramid (p. 107), andfamiliar (p.136) with the help of appropriate reinforcingcontingencies. The contingencies establish thecontrol of the responses by certain stimuli. Es-tablishing stimulus control frequently has ge-neric effects; the responses are evoked not onlyby identical but also by similar stimuli, that is,stimuli that share certain properties with theoriginal stimuli (e.g., Skinner, 1953, pp. 132-134; 1957, pp. 91-92). I will say that the class-es containing similar stimuli are the organ-ism's generalization classes relative to the orig-inal stimuli (for a more precise definition ofthis term, see Stemmer, 1980, 1983, 1989).The properties that control words such as

red, chair, and pyramid are physical properties,whereas other words, such asfamiliar, are con-trolled by other properties. Thus, Skinner ob-serves that a

familiar place is not anything distinguished byany physical property. It is familiar only tosomeone who has seen it or something like itbefore. Any place becomes familiar when fre-quently seen ... the condition responsible forfamiliar is not in the stimulus but in the historyof the speaker. Having acquired the responsewith respect to this property, the speaker mayemit it in the presence of other objects fre-quently seen. (1957, p. 136)

The word familiar is thus controlled by a non-physical property, the property "to be an objectfrequently seen by the speaker." This property

307

1990, 54, 307-315 NUMBER 3 (NOVEMBER)

NATHAN STEMMER

has acquired controlling power because of cer-tain events in the speaker's history.

Skinner does not discuss in detail the his-torical (ontogenic) contingencies that give con-trolling efficacy to nonphysical properties, butthey are investigated in experiments on stim-ulus equivalence (see, e.g., Dixon & Spradlin,1976; Sidman & Tailby, 1982; Sidman, Wil-son-Morris, & Kirk, 1986). The contingenciesthat have this effect in Pavlovian conditioningare described by Quine (1974) and Stemmer(1971, 1973, 1980, 1983). I will use the termfunctional properties to refer to nonphysicalproperties that have acquired controlling ef-ficacy.

Functional properties play an important rolein verbal behavior, because they control suchresponses as familiar, toy, clothes, tool, and soforth. However, it is important to realize thatattributing controlling efficacy to a functionalproperty must always be justified. Because theefficacy derives from specific contingencies, onemust always be able to show that the speaker'shistory indeed includes the relevant contin-gencies.

It often happens that the property that con-trols a child's verbal response is not identicalto the property that controls the response inthe verbal community. In particular, the child'scontrolling property may determine a gener-alization class that is wider than the class de-termined by the community's controlling prop-erty. According to Skinner, in this case thecommunity resorts "to another behavioral pro-cess which sharpens stimulus control and op-poses the process of extension. It reinforcesresponses in the presence of a chosen stimulusproperty and fails to reinforce, or perhaps evenpunishes, responses evoked by unspecifiedproperties" (1957, p. 107).

Punishment or nonreinforcement does notimmediately establish the correct controllingproperty. Because stimuli usually have severalproperties, sharpening the control of a partic-ular property may take several training ses-sions. During these sessions the organism triesout different discriminative properties until itarrives at the correct property or at the correctcombination of properties (e.g., Skinner, 1953,pp. 134-136; 1957, pp. 107-108, 117-119).As is suggested by experiments on attentionand cue-distinctiveness (e.g., Skinner, 1953,pp. 122-124; Mackintosh, 1974; Sutherland& Mackintosh, 1971), the more salient a prop-

erty is for the organism, the higher the prob-ability that the property will be tried out bythe organism. (Actually, it is the other wayaround. We call salient those properties thathave a high probability for becoming discrim-inative properties [cf. Skinner, 1953, p. 122].Still, the notion is useful because it is possibleto determine standards of salience [for an or-ganism or for a species] on the basis of inde-pendent experiments on discrimination [cf.Quine, 1974, pp. 25-28; Stemmer, 1983, pp.33-34].)

In addition to learning to say words, chil-dren also learn to become effective listeners.According to Skinner, the learning processesare basically Pavlovian conditioning processes.These give the listener the ability to react tothe verbal stimulus "with conditioned reflexes... or by taking action appropriate to a givenstate of affairs" (1957, p. 357). For example,a child can become an effective listener withrespect to the word Jones-plug

by watching someone working with electricalapparatus while describing his own behavioras he does so.... The effect upon the listeneris not only to establish Jones-plug as an appro-priate tact but to set up nonverbal behavior inresponse to similar stimuli, for example, be-having correctly when asked Please hand me aJones-plug. (1957, p. 360)

The contingency described here by Skinneris a typical Pavlovian contingency. The subjectis exposed to the pairing of two stimuli: theverbal stimulus Jones-plug (within some verbalcontext) and the concrete Jones-plug. As ispointed out by Skinner, such contingenciesusually have two effects. First, they set upappropriate nonverbal behavior in response tosimilar verbal stimuli, for example, to furtherutterances of the word Jones-plug (perhapswithin an appropriate verbal context such asin Hand me a Jones-plug). That is, the personbecomes an effective listener with respect tothe verbal stimuli. Second, the contingenciesestablish the verbal stimuli as appropriate tacts.The effective listener also becomes an effectivespeaker with respect to the verbal stimuli. Pav-lovian contingencies that have these effects willbe called ostensive contingencies (cf. Skinner'suse of the term ostensive definition, 1957, p.360).

Stimulus control can also be sharpened fora listener. The child who points to a horse

SKINNER AND CHOMSKY

when being asked Show me a dog will usuallynot receive reinforcement or may even be pun-ished. (For a brief comment on punishment inPavlovian conditioning, see Skinner, 1957, p.108.)

Except for a few words, such as Daddy orMommy, children do not appear to utter thewords of natural languages spontaneously.Most words are probably first learned by chil-dren as listeners. It is likely, however, that thesharpening of the stimulus control often takesplace when the children perform as speakers.It gives the community more opportunities tocorrect them. (There are, however, cases ofchildren who have become listeners withoutbecoming speakers [see, e.g., Lenneberg, 1962].This, together with the fact that most wordsare probably first learned by children as lis-teners, suggests that the study of the processesby which children become effective listeners isof the greatest importance for a correct analysisof verbal behavior.)

CONTEXTUAL LEARNINGI mentioned above that words controlled by

salient properties are easier to learn than wordscontrolled by "weak" properties. Salience canbe enhanced. Consider, for example, the wordholds. It is unlikely that hearing isolated ut-terances of holds while a holding relation isexemplified (e.g., the holding of a ball by aperson) will establish a holding feature as acontrolling property of the word. The featureis insufficiently salient. Nevertheless, an ap-propriate verbal context may heighten its sa-lience. Suppose a girl has already learned thewords Mommy and the ball, but has neverheard the word holds (or other variants of theverb to hold). The girl is now looking for herball (e.g., she emits Ball?), and her father re-sponds Mommy holds the ball. Hearing thisutterance will direct the girl's attention to theobjects named by Mommy and the ball-themother and the ball-and consequently to thespecific relationship in which the two objectsstand: the physical state of affairs that we call"holding" in our everyday language. The ver-bal context will give sufficient salience to thisrelation. Therefore, the ostensive contingency(or a number of such contingencies) will trans-form the girl into an effective listener withrespect to the structure (or pattern) x holds y:The structure will be controlled by holding

relations. This concept of structure corre-sponds to what Skinner calls functional unity(1957, pp. 119-120) or partial frame (1957,p. 336).

This account of the learning of structuressuch as x holds y is an hypothesis; hence, itcannot be shown to be true. It has, however,a high degree of plausibility, because (a) it isbased on reasonable extrapolations from ex-perimental results; (b) the behavioral effectsare attributed to conditioning processes ratherthan to "operations of the mind"; and (c) nomentalist assumptions are made, because alltheoretical notions that have been used in theaccount have been defined operationally. (Mostof the definitions are adapted from Skinner,1953, 1957.) Moreover, there seem to be noalternative hypotheses that account for thelearning of expressions such as holds, receives,or is smaller than that are more plausible. Thissuggests that the above account, which agreeswith Skinnerian views, is sufficiently plausibleto be accepted.

Let us now examine the learning of otherwords that are controlled by weak properties,namely, words such as which or who. We willsee that these words frequently play an im-portant role in grammatical behavior. I willconcentrate on the word which.

Suppose that a girl has never heard utter-ances of which, but already understands theother words that occur in Please, give me thebook which is on the piano. I am using here theword understands in the sense defined by Skin-ner (1957, p. 277) according to which the "lis-tener can be said to understand [a word] if hesimply behaves in an appropriate fashion"[with respect to utterances of the word]. Thegirl's father now says, Please, give me the bookwhich is on the piano, and there is indeed abook on the piano. This contingency (or severalof such contingencies) will transform the childinto an effective listener with respect to thestructure x which y, because the verbal contextcalls the girl's attention to the particular re-lation that is being expressed by this structure.In other words, with the help of the verbalcontext, the contingency brings the structureunder the control of a subtle property of theenvironment. This account is again merely anhypothesis, but because it has the featuresmentioned in connection with the learning ofholds, we can conclude that it, too, is suffi-ciently plausible to be accepted.

309

NATHAN STEMMER

GRAMMATICALGENERALIZATIONS

Let us now turn to Chomsky's argumentagainst behaviorist theories in general andSkinner's theory in particular. I will state herethe essential part of the argument, while alsotaking into account the versions of the argu-ment given in Chomsky (1957, 1965, 1968,1975, 1986). Consider so-called passive sen-

tences such as The book is held by Mommy or

The lecturer is interrupted by John. Speakersoften emit verbal responses consisting of newpassive sentences (i.e., passive sentences thatthey have not previously heard). Because theseresponses are new, they have not been rein-forced previously. Similar facts hold for lis-teners; effective listeners can understand pas-sive sentences even if they have never heardthem. Analogous conclusions hold for othersentence forms such as questions or negatives.People emit interrogative responses on the ap-propriate occasions, yet these questions are of-ten new in the sense that they were not rein-forced previously. Similar facts hold forlisteners.

Chomsky's argument now can be expressedin the following way. He assumes that the"creative" capacity to understand and producenew sentences derives from the grammar thatwe have internalized. "'We understand a new

sentence, in part, because we are somehowcapable of determining the process by whichthis sentence is derived in this grammar"(Chomsky, 1959, p. 56). Now Chomsky (1957)had shown that grammars consist of very com-plex structures, and he believes that the ac-

quisition of a structure-dependent grammarcannot be explained as the result of a processof generalization of the type studied by Skinner(or other behaviorists or so-called empiricists).He concludes therefore that Skinner cannotaccount for our creative verbal capacity andthat the capacity seems to be largely geneticallydetermined. "The fact that all normal childrenacquire essentially comparable grammars ofgreat complexity with remarkable rapiditysuggests that human beings are somehow spe-cially [innately] designed to do this" (Chom-sky, 1959, p. 57).

Skinner does not deal explicitly with theprocesses by which people acquire this "cre-ative" power, but his analysis of the learningof grammar (1957, pp. 331-334) suggests thathe indeed attributes it to a process of gener-

alization. MacCorquodale explicitly attributesthe learning of grammatical behavior to thechild's ability "to make complex abstractionsand to generalize from them to diverse newinstances" (1970, p. 93), and he mentions inthis connection the processes of stimulus gen-eralization and response induction.

Generalizations are always under the con-trol of a particular property or combination ofproperties, however (e.g., Skinner, 1953, pp.132-136; 1957, pp. 107-111), so we now ar-rive at a fundamental question: What are theseproperties, and how do they acquire their con-trolling power? Concentrating on the "pas-sive" ability, our question is thus: What is theproperty that controls the generalizations thatenable us to respond with new passive sen-tences, and what are the ontogenic contingen-cies that establish its controlling power for ef-fective listeners and speakers?

Although this is a crucial question, neitherSkinner nor MacCorquodale explicitly dealwith it. Skinner attributes grammatical be-havior mainly to autoclitic learning. He states,for example, that "the grouping and orderingof responses is . . . autoclitic" (1957, p. 332),but it is difficult to extract from his brief anal-ysis a method for arriving at the controllingproperties of grammatical generalizations andat the required contingencies.

MacCorquodale discusses the ability to sortthe sentences one hears "into classes or subsetshaving some property in common and differingfrom other subsets in some property" (1970,p. 94). This ability apparently makes it pos-sible to distinguish passive sentences from ac-tive sentences, from questions, from nonsen-tences, and so forth. MacCorquodale speaksonly of some property that controls these classesand subsets, however, without mentioning itsnature and without specifying the contingen-cies that make it a controlling property. Thisaccount does not satisfy Chomsky and his fol-lowers, who feel it is too vague and too sketchy.Perhaps this is one of the main reasons thatcognitive psychologists have largely ignoredMacCorquodale's article.What are we to say about MacCorquodale's

account? We would probably agree that thereis little value to an explanation that attributesgeneralizations such as those from (a) The lec-turer is interrupted by John to (b) The book isheld by the man who holds thepainting to controlby an unspecified property. Still, it has animportant quality; it is based on an extrapo-

310

SKINNER AND CHOMSKY

lation from an experimentally based theoret-ical framework. Moreover, Chomsky's alter-native nativist hypothesis is not very attractive.Because innate capacities are free for the ask-ing, it is too easy. Instead, sound methodology(as well as evolutionary considerations) sug-gest that, before postulating an innate mech-anism to explain the creative capacity, a se-rious effort should first be made to explain itin terms of experimentally established learningcapacities (cf. Stemmer, 1987a, pp. 97-100).

MacCorquodale's observations, as well asSkinner's discussion of grammatical behavior,presumably point to a research program; thisprogram should eventually enable us to specify(a) the generalization processes that producegrammatical behavior, (b) the contingenciesthat initiate the processes, (c) the propertiesthat control the generalizations, and (d) thecontingencies that give the properties theircontrolling power. I have started with such aresearch program in earlier publications (es-pecially in Stemmer, 1971, 1973, 1981, 1987a,1987b), and in the following section I will statethe main conclusions of that program.

STRUCTURE-DEPENDENTGRAMMARS

Let us return to the learning of the wordholds, and for simplicity let us concentrate onthe listener. Previously we have seen that chil-dren can learn the structure x holds y by un-dergoing appropriate ostensive contingencies.Expressions such as holds will be called(bi) relational terms, and the values of the var-iables x and y, the first and second arguments.Thus the following sentences contain the re-lational term holds accompanied by different(first and second) arguments: Mommy holds theball, The man holds the book, and The man whoholds the painting holds the book. We notice thatthe nature of the arguments is determined bythe nature of the relational word, or, moreexactly, by the environmental property thatcontrols the word holds. The first argumentsare expressions denoting entities that can holdsomething (i.e., responses that are controlledby such entities), the second arguments areexpressions that denote entities that can beheld, and the entities denoted by the secondargument are those held by the entities thatare denoted by the first arguments.The property that controls each type of ar-

gument is not physical. The expressionsMommy and the man share no (psychologicallysignificant) physical property. Rather, it is afunctional property, and its controlling power(with respect to being an argument of holds)is established in the contingencies in which thechildren have learned the expressions and theword holds. In other words, the contingenciesestablish the classes containing the argumentsof holds as valid generalization classes for thechildren.

It is important to realize that each relationalterm determines the specific nature of its ar-guments. The arguments for holds are differentfrom those of receives, buys, or is bigger than.Their nature is determined by the nature of (a)the contingencies that gave the relational termstheir "meanings" (i.e., the contingencies that,for each relational term, conferred controllingpower to a particular environmental propertyor properties) and (b) the contingencies thatgave the arguments their "meanings."

Arguments, too, can be structured. For ex-ample, the clause the man who holds the paint-ing is the first argument in (c) The man whoholds the painting holds the book. This clauseis a structure, and its nature is determined bythe relational word who, which determines thestructure x who y. In our example, the firstargument of who is the man and its secondargument is holds the painting. The nature ofthese arguments is determined by the above-mentioned contingencies, in particular, thecontingencies that establish a particular prop-erty of the verbal and nonverbal environmentas the controlling feature of x who y.

Substructures, as well as their elements, of-ten have properties that are determined by"higher" structures. For example, the struc-ture of (c) The man who holds the painting holdsthe book determines that the second occurrenceof holds has the status of being the main re-lational term of (c), whereas the first occur-rence, which occurs in a substructure, has thestatus of a secondary relational term.The learning of relational words such as

who or which has often been neglected. It isclear, however, that these words have specificmeanings, which implies that they must belearned in specific contingencies, usually con-textual contingencies. Moreover, because oftheir particular relational character, the wordsplay a prominent role in the formation of sub-structures. It is therefore very important to givea correct account of the processes and contin-

311

NATHAN STEMMER

gencies by which such relational words arelearned (cf. Skinner, 1957, pp. 347-348).We will now see that grammatical gener-

alizations are based on structural properties.I will begin with the semantic relation betweenactive and passive sentences. Suppose that ayoung girl, who has never heard a passive sen-tence, has already learned to understand thesentence (d) Daddy receives the book. That is,appropriate contingencies have established theproperty of Daddy receiving a book as a con-trolling property of (d). Among these contin-gencies are those by which she learned thestructure x receives y. The girl now sees herfather receiving a book and she hears this eventbeing described with the new sentence (e) Thebook is received by Daddy. This contingency (orvarious contingencies of this type) establishesa stimulus-controlled generalization: from sen-tences that have the structure x receives y tosentences having the structure y is received byx, where x and y are the first and second ar-guments of the relational word receives. Thatis, the environmental properties that controlresponses of the first type of sentence now alsocontrol those of the second type.

This generalization gives the girl a creativecapacity. If she is able to emit, for instance,The man who holds the book receives a ball, shewill also be able to emit The ball is received bythe man who holds the book. Both responses arecontrolled by the same property. We noticethat, in this example, not only the sentence butalso the argument x is structured.

Moreover, if the girl has already learnedother relational terms, then this contingency(or various contingencies of this type) also es-tablishes a wider generalization: from sen-tences of the structurexRy to those of the struc-ture y is R-ed by x, where R is a relationalterm and x and y are the first and secondarguments of R. Each pair of responses is con-trolled by the same property. This wider gen-eralization is highly important, because it isno longer restricted to the specific relationalword receives. Rather, it covers relational wordsin general.The wide generalization is effective for the

children who have learned the meanings of therelational words; that is, children who under-went contingencies that establish the controlof the words by certain environmental stimuli.For these contingencies also have a second ef-fect. They give controlling efficacy to a prop-erty that is shared by the words, namely, the

functional property "to be a relational term."This property can therefore play its controllingrole in the wider generalization. Actually, it isnot sufficient for R to be a relational term; itmust have additional features, in particularthose of verbs (Stemmer, 1987a, pp. 105-107).However, these features, too, are controllingproperties established by contingencies.Our conclusions now account for the passive

capacity. If children are able to emit an activesentence, then the wide generalization also en-ables them to emit the corresponding passivesentence. Similar generalizations allow thechildren to transform the active sentences intoother types of sentences such as interrogativeor negative sentences.What about the active sentences themselves?

The structures of these sentences are usuallylearned individually. They are learned when,with the help of appropriate contingencies, thechildren learn the meanings of the relevantrelational terms. This learning includes thelearning of the corresponding structures, as inx holds y. Children do undergo these contin-gencies, because everyone admits that childrenmust learn the meaning of every lexical rootindividually. (Actually, some roots may belearned when they occur in a passive sentence,in a question, etc. In these specific cases, thegeneralizations go from [say] passive sentencesto corresponding active or interrogative sen-tences.) Consequently, we can account for chil-dren's capacity to emit active sentences as wellas corresponding passive, interrogative, neg-ative, or imperative sentences. The former areindividually learned, whereas the latter are theproduct of appropriate generalizations. Thisaccount therefore radically differs from the onegiven by semanticists (e.g., Bowerman, 1976;Braine & Hardy, 1982; Schlesinger, 1982) whoattribute the capacity to emit active sentencesto generalizations rather than to individuallearning. These generalizations are based onproperties such as actor or action; however, theintrinsic defects of this account are discussedin Stemmer (1973, pp. 125-127, 1987a, pp.114-117).

It is important to realize that the propertiesthat control the generalizations we have dis-cussed are not abstract. They are functionalproperties that control concrete stimuli. Forexample, "to be a relational term" is a con-tingently established functional property thatcontrols concrete relational terms (i.e., the spe-cific terms that occurred in the relevant con-

312

SKINNER AND CHOMSKY

tingencies). Similarly, the structures we havediscussed have no special status. They derivefrom the relational character of the relevantterms (i.e., from the fact that relational termsalways have arguments). Finally, the classescontaining the arguments of relational wordsare valid generalization classes. They are de-termined by contingently established func-tional properties.

There are many more properties that con-trol our grammatical generalizations, such asbeing a verb, a noun, or an adjective. It canbe shown that these properties, too, acquiretheir controlling power through appropriatecontingencies (see, e.g., Stemmer, 1971, 1973,pp. 70-71, 1987a, pp. 105-106. Psycholin-guists often use the term distributional analysisto refer to such contingencies and their effects(e.g., Maratsos, 1982).The account I have given here of the learn-

ing of grammatical behavior is of course merelyan hypothesis. However, it has all the featuresmentioned in connection with the learning ofholds. Note in particular that children mustundergo the contingencies that, according tothe hypothesis, confer controlling properties tocertain properties. Thus, children who are ableto emit The ball is received by the man whoholds the book on appropriate occasions musthave undergone the contingencies by whichthey learned relational words, in particular,the word who and the relational roots of toreceive and to hold. We recall that such con-tingencies also give controlling efficacy to thefunctional property "to be a relational term."Moreover, they must have heard at least oncea passive sentence and noticed its relation to acorresponding active one (or vice versa). Wecan therefore conclude that this hypothesis,which agrees with Skinnerian views, is a plau-sible one.The account is not only plausible, but it is

more plausible than the main alternative hy-potheses that are being discussed nowadays:the nativist hypotheses proposed by Chomskyand others, and the semanticist hypotheses. Ihave already mentioned that the latter haveintrinsic defects, and in Stemmer (1987a,1987b) I discuss the shortcomings of the for-mer. Let me mention here one of these short-comings (Stemmer, 1987b, p. 142). Nativistsassume that syntactic categories such as nounphrase or verb phrase are somehow innate. Byclassifying expressions into these categories,children can then acquire a structure-depen-

dent grammar. But to assume that the cate-gories are innate is not sufficient. One mustalso explain the children's capacity to classifyexpressions-concrete items such as the manwho holds-into the categories. Because nativ-ists surely admit that this capacity is not innate,they must explain how it is acquired. Do chil-dren learn this, and if so, what are the learningprocesses (e.g., conditioning processes) andwhat are the contingencies that are requiredfor this? A perusal of the literature shows thatnativists have ignored these issues completely.Clearly, as long as these fundamental ques-tions have not been answered, one cannot saythat there exists a nativist theory of gram-matical competence or grammatical behavior.Chomsky (1975, pp. 31-33) discusses an

example that shows that children's grammat-ical generalizations are structure dependent.Because Chomsky believes that behaviorist (andempiricist) hypotheses cannot account forstructure-dependent generalizations (e.g.,1975, pp. 178-204), the example is offered asproof that these theories are false. We will nowsee that Chomsky is mistaken. (I have madesome minor changes in the example.)

Suppose a child has learned to form ques-tions by hearing the pairs The man is tall-Isthe man tall? and The book is on the table-Isthe book on the table? Chomsky believes thatbehaviorist (and empiricist) theories attributethe child's generalization from these examplesto the following controlling property: the firstoccurrence of the word is of the active sentenceis transferred to the beginning of the corre-sponding question. However, Chomsky ob-serves, this controlling property, which is notstructure dependent, would transform the ac-tive sentence (f) The man who is tall is in theroom into (g) Is the man who tall is in the room?Yet, "children make many mistakes in lan-guage learning but never mistakes such as ex-emplified in [g]" (Chomsky, 1975, p. 31).Rather, they unerringly form the correct ques-tion (h) Is the man who is tall in the room? (ifthey can handle the example). This proves thatthe children's generalization is based on astructure-dependent controlling property,which requires analyzing sentences into phrasesthat "are abstract in the sense that neither theirboundaries nor their categories (noun phrase,verb phrase, etc.) need to be physically marked"(1975, p. 32). Chomsky's conclusion is that thechild's mind "contains the instruction: Con-struct a structure-dependent rule, ignoring all

313

314 NATHAN STEMMER

structure-independent rules. The principle ofstructure-dependence is not learned, but formspart of the [innate] conditions of languagelearning" (1975, pp. 32-33).We have seen, however, that there is no need

to adopt this nativist conclusion. When chil-dren learn is, they do not learn the isolatedword but the structure x is y. And this struc-ture, rather than innate factors, determines thestructure of (e). In particular, the second oc-currence of is is the main relational word andthe first occurrence is secondary; it occurs inthe substructure the man who is tall. Now whenchildren, by generalizing from appropriate in-stances, learn to transform active sentences intoquestions, the generalization will be con-trolled by the functional property "to be arelational term" (among others). Therefore,the generalization does not simply transformcertain word sequences into other word se-quences but rather certain structures into otherstructures. To be sure, the features and prop-erties that control the generalization are notphysical. But they are not abstract either. Theyare functional properties that control concretebits of verbal behavior, and they receive theircontrolling efficacy in contingencies that chil-dren indeed undergo. (We recall that childrenlearn individually the lexical root of every word,and this includes is.) Consequently, contraryto Chomsky's claim, a Skinnerian hypothesiscan account for the structural property thatcontrols the emission of question (h).

CONCLUSIONSThere are two main reasons why natural

languages have structure-dependent gram-mars. First, they all have relational words.Because these words have arguments, they giveorigin to structures and substructures. Second,all these languages permit transformations ofcertain structures into others.

Children learn grammatical behavior withthe help of three types of contingencies (thatmay overlap). The first are the contingenciesby which children learn relational words andthe words that occur in the corresponding ar-guments. The second are the contingencies thatgive controlling efficacy to functional proper-ties, such as the property of being a relationalword or of being an argument of holds. The

third are the contingencies that give origin togrammatical generalizations. These general-izations permit the transformation of the struc-tures determined by the relational words intoother structures, and the controlling propertiesof the generalizations are appropriate func-tional properties. Because the learning pro-cesses that are initiated by the contingenciesare operant or respondent conditioning pro-cesses or related processes such as those thatproduce stimulus equivalence, we can concludethat Skinnerian frameworks account for gram-matical behavior. There is no need to adoptmentalist approaches or to add mentalist as-sumptions to behaviorist theories.

REFERENCES

Bowerman, M. (1976). Semantic factors in the acqui-sition of rules for word use and sentence construction.In D. M. Morehead & A. E. Morehead (Eds.), Normaland defrient child language (pp. 99-179). Baltimore:University Park Press.

Braine, M. D. S., & Hardy, J. A. (1982). On whatcategories there are, why they are, and how they de-velop: An amalgam of a priori considerations, specu-lation, and evidence from children. In E. Wanner &L. R. Gleitman (Eds.), Language acquisition: The stateof the art (pp. 219-239). New York: Cambridge Uni-versity Press.

Chomsky, N. (1957). Syntactic structures. The Hague:Mouton.

Chomsky, N. (1959). Review of Skinner's Verbal Be-havior. Language, 35, 26-58.

Chomsky, N. (1965). Aspects ofthe theory ofsyntax. Cam-bridge, MA: M.I.T. Press.

Chomsky, N. (1968). Language and mind. New York:Harcourt, Brace & World.

Chomsky, N. (1975). Reflectionson language. New York:Pantheon Books.

Chomsky, N. (1986). Knowledge of language: Its nature,origins, and use. New York: Praeger.

Czubaroff, J. (1988). Criticism and response in the Skin-ner controversies. Journal of the Experimental Analysisof Behavior, 49, 321-329.

Dixon, M., & Spradlin, J. (1976). Establishing stimulusequivalences among retarded adolescents. Journal ofEx-perimental Child Psychology, 21, 144-164.

Killeen, P. R. (1984). Emergent behaviorism. Behavior-ism, 12(2), 25-39.

Lenneberg, E. H. (1962). Understanding languagewithout ability to speak: A case report. Journal of Ab-normal and Social Psychology, 65, 419-425.

MacCorquodale, K. (1970). On Chomsky's review ofSkinner's Verbal Behavior. Journal of the ExperimentalAnalysis of Behavior, 13, 83-99.

Mackintosh, N.J. (1974). Thepsychology ofanimal learn-ing. London: Academic Press.

Maratsos, M. (1982). The child's construction of gram-matical categories. In E. Wanner & L. R. Gleitman

SKINNER AND CHOMSKY 315

(Eds.), Language acquisition: The state of the art (pp.240-266). New York: Cambridge University Press.

Quine, W. V. (1974). The roots of reference. La Salle,IL: Open Court.

Schlesinger, I. M. (1982). Steps to language. Hillsdale,NJ: Erlbaum.

Sidman, M., & Tailby, W. (1982). Conditional discrim-ination vs. matching to sample: An expansion of thetesting paradigm. Journal of the Experimental Analysisof Behavior, 37, 5-22.

Sidman, M., Wilson-Morris, M., & Kirk, B. (1986).Matching-to-sample procedures and the developmentof equivalence relations: The role of naming. Analysisand Intervention in Development Disabilities, 6, 1-19.

Skinner, B. F. (1953). Science and human behavior. NewYork: Macmillan.

Skinner, B. F. (1957). Verbal behavior. New York: Ap-pleton-Century-Crofts.

Stemmer, N. (1971). Some aspects of language acqui-sition. In Y. Bar-Hillel (Ed.), Pragmatics of naturallanguages (pp. 194-231). Dordrecht: Reidel.

Stemmer, N. (1973). An empiricist theory of languageacquisition. The Hague: Mouton.

Stemmer, N. (1980). Natural concepts and generaliza-tion classes. Behavior Analyst, 3(2), 41-48.

Stemmer, N. (1981). A note on empiricism and struc-ture-dependence. Journal of Child Language, 8, 649-656.

Stemmer, N. (1983). The roots of knowledge. New York:St. Martin's Press.

Stemmer, N. (1987a). The learning of syntax: An em-piricist approach. First Language, 7, 97-120.

Stemmer, N. (1987b). The learning of syntax: A reply.First Language, 7, 137-144.

Stemmer, N. (1989). The acquisition of the ostensivelexicon: The superiority of empiricist over cognitivetheories. Behaviorism, 17, 41-61.

Sutherland, N. S., & Mackintosh, N. J. (1971). Mech-anisms of animal discrimination learning. New York:Academic Press.

Received March 14, 1990Final acceptance June 10, 1990

skinner's verbal behavior, chomsky's review and mentalism

Documents