coordination, not control, is central to movementcspeech.ucd.ie/fred/docs/cumminscaserta2010.pdf ·...

13
Coordination, not Control, is Central to Movement Fred Cummins School of Computer Science and Informatics University College Dublin [email protected] [email protected] http://pworldrworld.com/fred Abstract. The notion of the control of action is contrasted with that of coordination. In coordinated action, many parts of the body (or bod- ies) come together to act as if they served a specific purpose, recogniz- able as a behavioral goal. Such simpler domains of yoked components are called coordinative structures. Examples are given of the harnessing of components into coordinative structures. In the first case, known as synchronous speech, two speakers are subsumed within a single dyadic domain of organization that exists for as long as the speakers speak in synchrony. In the second case, a time-varying set of articulators work collaboratively in generating natural and fluent movement in accordance with a behavioral goal consisting of a desired utterance. In the latter case, we introduce a new model, extending the venerable task dynamic model familiar to students of articulatory phonology. In the new embodied task dynamic model, precise gestural timing arises, not from computation and control, but from considerations of optimality in movement. A can- didate function for optimization combines terms derived from the esti- mation of articulatory effort, perceptual clarity, and speech rate. Both of these examples illustrate a methodological advantage of dynamical mod- els that demand that the modeler first identify both components and system boundaries as they occur within the context of a specific behav- ioral goal. This contrasts with many approaches within computational cognitive science. Key words: speech production, motor control, coordination, dynamical systems, embodiment, autonomy 1 Introduction In the study of human behavior, and its relation to brain activity, misunder- standings abound. The phrase “motor control” invites a particularly pernicious misreading that is likely to trap the unwary into a cartoon vision of the brain as a master puppeteer, simultaneously sending “control signals” down neural wires, that have “motor action” or behavior, as their end product. This interpretation is a grotesque distortion of the role of the brain. It fundamentally mischarac- terizes its place in the systematic organization of behavior. The puppeteer that

Upload: others

Post on 06-Jul-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

Coordination, not Control, is Central toMovement

Fred Cummins

School of Computer Science and InformaticsUniversity College Dublin

[email protected] [email protected]

http://pworldrworld.com/fred

Abstract. The notion of the control of action is contrasted with thatof coordination. In coordinated action, many parts of the body (or bod-ies) come together to act as if they served a specific purpose, recogniz-able as a behavioral goal. Such simpler domains of yoked componentsare called coordinative structures. Examples are given of the harnessingof components into coordinative structures. In the first case, known assynchronous speech, two speakers are subsumed within a single dyadicdomain of organization that exists for as long as the speakers speak insynchrony. In the second case, a time-varying set of articulators workcollaboratively in generating natural and fluent movement in accordancewith a behavioral goal consisting of a desired utterance. In the latter case,we introduce a new model, extending the venerable task dynamic modelfamiliar to students of articulatory phonology. In the new embodied taskdynamic model, precise gestural timing arises, not from computationand control, but from considerations of optimality in movement. A can-didate function for optimization combines terms derived from the esti-mation of articulatory effort, perceptual clarity, and speech rate. Both ofthese examples illustrate a methodological advantage of dynamical mod-els that demand that the modeler first identify both components andsystem boundaries as they occur within the context of a specific behav-ioral goal. This contrasts with many approaches within computationalcognitive science.

Key words: speech production, motor control, coordination, dynamicalsystems, embodiment, autonomy

1 Introduction

In the study of human behavior, and its relation to brain activity, misunder-standings abound. The phrase “motor control” invites a particularly perniciousmisreading that is likely to trap the unwary into a cartoon vision of the brain asa master puppeteer, simultaneously sending “control signals” down neural wires,that have “motor action” or behavior, as their end product. This interpretationis a grotesque distortion of the role of the brain. It fundamentally mischarac-terizes its place in the systematic organization of behavior. The puppeteer that

Page 2: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

2 Fred Cummins

lurks behind a simplistic interpretation of the notion of “control” is a homuncu-lar invention whose mysterious agency appears to be made manifest through theinert machinery of the body.

Unfortunately, something like this myth informs most accounts of overt be-havior, particularly those in which behavior is seen as the output of a system ofthree parts: perception, considered as input, cognition as the substantial middle,and action or overt behavior as the final product. This is the familiar archi-tecture of conventional cognitive psychology, and it has, for many, the statusof orthodoxy. It is perhaps unsurprising that those who regard their inquiry asdirected towards the cognitive heart of this hypothetical system should expendlittle effort in questioning the role of its peripheral twins, perception and action.Yet it seems, to this author, that the study of the origin and form of action haslanguished disproportionately on the fringes of cognitive science (and cognitivepsychology in particular), and that a fundamentally different account of the roleof the brain is becoming available within what we might term a post-cognitivistframework [19, 14, 28, 7, 6].

An alternative to the notion of control in describing movement for goal di-rected behavior is available, and this is coordination [14]. Two brief accounts ofcoordination during speaking will be presented here. Each of them makes use, notof information processing concepts, but of the vocabulary of Dynamic SystemsTheory. The first looks at coordination across multiple individuals, and the sec-ond looks at coordination among a constantly changing set of body parts withinan individual. In each case there is a system that is understood to constitute awell-formed domain in which the constituent parts exhibit lawful interrelations.In neither case is this system a perception-cognition-action unit, and in neithercase is the notion of control of use in teasing out the lawfulness we observe withinthe system. Both bodies of work may be of interest to speech scientists, for whomthe specific tasks involved are of obvious relevance. It is to be hoped that themodeling issues that arise, and the implications of adopting a dynamical per-spective in understanding action will be of interest to a wider set of researcherswithin the emerging post-cognitivist framework.

2 Coordination and Coordinative Structures

As long ago as 1930, the Russian physiologist Nikolai Bernstein observed a cu-rious and telling characteristic of the skilled movements of blacksmiths as theyrepeatedly hit an anvil with a hammer [1, 16] (Fig. 1). He recorded movementat the shoulder, elbow, hand, and at the point of contact between hammer andanvil. Variability from blow to blow was minimized at the point of contact, andnot at any of the biomechanical joints. This seems appropriate for skilled ac-tion, as the behavioral goal that finds expression here is best expressed at thatpoint, while there are many potential configurations of the limb segments thatcan give rise to equally accurate hammer blows. But it raises huge problems forany account of the movement as arising from central control directed from braintowards the periphery. If the brain were issuing control signals, and we make the

Page 3: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

Coordination, not Control, is Central to Movement 3

further not unreasonable assumption that any biological process is attended bysome non-zero noise level, then noise or error introduced at the shoulder jointwould appear as additive noise at the elbow joint, and error from both shoulderand elbow should appear together with wrist noise at the wrist joint. In short,in a multi-link system, distal errors are predicted to be larger than proximal er-rors. Even if some error correction were possible at the elbow to compensate forshoulder error, it is inconceivable that the point of minimum variation shouldbe the point of contact between hammer and anvil, where direct interventionby the brain is impossible in principle. This argument holds true whether thecontrolled variables are taken to be joint angles, torques, muscle lengths, or anyother candidate.

Fig. 1. Movement variability in this skilled action is minimized at the point wherehammer meets anvil, and is greater at the shoulder, elbow, and wrist joints.

This example illustrates a well-known, but perplexing, characteristic of skilledaction: although the actor is possessed of a hugely flexible biomechanical systemwith an innumerable number of potential degrees of freedom, when this systemis engaged in the pursuit of a specific behavioral goal, it behaves as if it were amuch simpler system, with vastly fewer degrees of freedom, in which all the partswork cooperatively towards the attainment of that goal. Perturbation or error atone point of the system is smoothly and rapidly compensated at another part.The rapidity and specificity of compensation appears to rule out any processingarchitecture that consists of an executive centre that is informed of distal errorsand that computes appropriate compensatory changes to control signals. Forexample, Kelso et al. found that a perturbation to the jaw during production of

Page 4: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

4 Fred Cummins

the final consonant of either /bæz/ or /bæb/ generated compensatory movementthat was specific to the articulatory goal currently constraining movement [15].When the target was /z/, there was an almost immediate response of the tongue,while when the goal was /b/, the response was seen in the movement of the twolips. Crucially, the goal-specific response kicked in approximately 20 ms after theperturbation, which is sufficiently rapid to rule out an account couched in termsof monitoring and error-correcting by an executive controller.

These kind of observations have led to the notion of a coordinative structure,which is a task-specific functional linkage of specific body parts, such that theyact as a unit in the achievement of the behavioral goal [27]. Another term fre-quently employed is synergy [16]. On this view, parts of the body flexibly partakein a series of organizational forms that are defined by the behavioral goals. Wemight speak of the emergence, maintenance, and dissolution of special-purposekicking machines, scratching machines, speaking machines, throwing machines,etc. Skilled action is found to organize the many parts of the body so that theyact with common purpose. The effective complexity of the biomechanical systempartaking in the action is greatly reduced once a clear and practiced behavioralgoal is established.

When body parts become coordinated in this fashion, they constitute a do-main of organization within which the component parts are lawfully related.Change in the state of any one component is not entirely independent of changein the state of any other. Components may themselves be complex entities thatcan be decomposed into sub-constituents that are coordinated to bring about thecomponent-level behavior, and no single level of behavioral description can claimto be privileged. For example, Kelso and colleagues have long studied the formof coordination exhibited when two effectors (fingers, hands, etc) are constrainedto oscillate with a common frequency [14]. Given the behavioral goal providedby the task description, only two forms of stable coordination are observed, onein which the effectors oscillate with common phase (synchrony) and one in whichthey are half a cycle out of step with each other (syncopation). Many featuresof this system, including rate-dependent multi-stability, phase transitions, criti-cal fluctuations, hysteresis, etc, have been modeled using the Haken-Kelso-Bunz(HKB) model [12]. Details of the model are not relevant here, but the structureof the model provides an insight into how dynamical models might approachthe systematic simplification that is evident in skilled action. First, the behaviorof individual components is characterized. In the present case, that amounts todescribing each effector as a self-sustaining oscillator. Then the behavior of thecomponents in the service of the task is described. The lack of independence be-tween the effectors ensures that this system-level description is simpler (of lowerdimension) than a full description of the components and their mutual interac-tions. In the HKB case, the model then provides a formal account of how thehigher dimensional component description collapses to the simpler system-leveldescription.

The system being described here is not a whole person. It is the two effectorsystem, which is considered as a whole—a domain of organization constituted

Page 5: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

Coordination, not Control, is Central to Movement 5

by the coordinated movement of two components. In developing a dynamicalaccount of observed phenomena, the identification of the domain of the model isan important first step, formalized as the definition of the state of the system.The selection of appropriate state variables allows the modeler to potentiallyidentify lawful domains of organization that transcend the somewhat arbitraryboundaries separating brains, bodies and environments. This is clearly illus-trated in the interpersonal coordination documented in Schmidt, Carello andTurvey (1990). In that study, the task of oscillating two effectors at the samefrequency was distributed across two individuals, seated, each wagging theirlower leg. The same hallmarks of differential pattern stability were found inthis scenario, in that there were two and only two stable patterns of coordina-tion (synchronous/syncopated), there was a greater stability of the synchronouspattern and a strong tendency for the syncopated pattern to transit to the syn-chronous one in a rate-dependent manner. Of course these two individuals arenot obligatorily coordinated. Each is free to stand up and go about their in-dividual lives. But in the context of the behavioral task, they behave (or moreaccurately, the system comprising their two legs behaves) as as a simpler systemwith few degrees of freedom.

3 Synchronous Speech

Something very similar is seen in the coordination displayed by subjects withinthe synchronous speech experimental situation. Synchronous speaking is a be-havioral task in which a pair of subjects are given a novel text, they familiarizethemselves briefly with the text, and then they read the text together, in ap-proximate synchrony, on a cue from the experimenter [8]. It is distinguished fromthe related notion of choral speaking, in that the text is new and is thus notproduced with the exaggerated prosody familiar from the group recitation ofoaths, prayers, etc. Typically, subjects are very good at this task, and withoutany practice, they maintain a tight synchrony with lags (temporal offset betweentheir speech streams) of no more than 40 ms on average [9]. This is comparableto an asynchrony of no more than a single frame of video. Perhaps remarkably,practice does not generate much improvement; the synchronization task appearsto tap into a natural facility for synchronization with another person, despitethe complexity of the task of speaking.

How should we view the sustained exhibition of very tight synchrony be-tween two speakers (Fig. 2)? One way, and that most readily at hand withinmost current approaches, is to view each speaker as an independent system.A speech production system within each individual is held responsible for theplanning and execution of movement. To this picture, we would have to add aperceptual component that monitors the speech being produced by the other,that compares one production with the other, and that makes correspondingadjustments. This unwieldy picture appears obligatory if we treat the brain ascontrolling puppeteer, and if we view the people involved as perception-then-cognition-then-action systems. Construing the synchronous speaking situation

Page 6: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

6 Fred Cummins

Fig. 2. As modelers, we have the freedom to chose to regard a pair of synchronousspeakers as a single system, or as two distinct and interacting systems.

in this way, we would expect to see evidence of drift between speakers and com-pensatory error correction at time lags that would allow for re-planning andalteration of control parameters. As that is not what we observe experimentally,one might instead regard one speaker as providing a lead, to which the otherresponds. This would suggest that somewhat stable performance might be foundwith a fixed lag between speakers. This is never the case. For each speaker, theinherent variability that attends normal unconstrained speaking would be aug-mented by the additional requirement of attempting to match the timing of theco-speaker. Variability would therefore be predicted to increase. It does not. Thedifficulty in adequately describing (modeling) the situation stems from the priorcommitment to the locus of control as lying within an individual.

If we instead view the two speakers as enslaved components within a sin-gle overarching system, each of them both driven and driving the behavior ofthe system as a whole, our expectations would be rather different. Where weknow speech production to be inherently complex and variable, even within thespeech of a single individual, we would expect a simplification, or reduction invariability while the speakers are behaving as components within a superordinatesystem. This is, indeed, what we find [10, 11]. Variability in segment duration,in pitch movement, and in pause behavior are all reliably found to be reducedin synchronous speech, as compared to speech that is not so constrained. If thecomponents of the system are mutually correcting, just as in the coordinatedmovement of body parts within a skilled individual, we would expect no clearleader-follower behavior, and this is, in fact, what we find.

A blacksmith’s arm is perfectly capable of wielding a violin bow, of scratchinga blacksmith’s chest, or of shaking the hand of the fishmonger. But when theblacksmith pursues a well-defined behavioral goal, demanding skilled (and henceunreflective) action, the arm acts as part of an overall domain of organizationthat is brought into existence in pursuit of just that goal, and that ceases toexist once the blacksmith turns to other tasks. So too, in speaking synchronously,each speaker temporarily becomes part of a larger organizational domain, andthe speaker acts as if he were a component within the larger system. Tellingly,

Page 7: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

Coordination, not Control, is Central to Movement 7

we find that when one speaker makes an error, the usual result is for the entirefragile coordinative system to fall apart, each speaker recovers full autonomy,and the speech of each is immediately and completely decoupled from that ofthe other. The error destroys the boundary conditions imposed by the commonbehavioral goal.

If this account of coordination, within and across individuals, appears counter-intuitive, it is probably because the conviction is rooted very firmly that behindevery lawful intentional action must lurk a controller. How else, we might rea-sonably ask, is one possible movement selected out of many alternatives? How,indeed, are we to account for volition in action and our sense of being the au-thors of our own lives? Such existential qualms are probably not warranted here,and some of the unease may be vanquished by recognizing that the type ofmodel being developed within a dynamical framework is fundamentally differentfrom that within a perception-cognition-action, or cognitivist, framework1. Thedynamic account of action suggests that the fluid movement typical of skilledcoordination derives its form from the lawful constraints operative on the manycomponents that contribute to the overall behavior. An account is then requiredof the nature of these constraints. Why is that we observe one form of movementand not another? This question is particularly vexing as the behavioral goal thatunderlies the temporary organization of parts into a single-purpose domain oforganization does not, by itself, contain any specification for how that goal is tobe achieved. In reaching to scratch my nose, there are many possible trajectoriesmy hand and arm could take, and one of these actually happens, without anysense of pondering, selection, or doubt.

An important part of the answer is to look more closely at the specification ofthe behavioral goals, and to see to what extent they might serve to differentiateamong possible forms of movement. Given specific goals, some forms of move-ment may be optimal in a strict and quantitative sense, and optimality criteriamay be the best candidates for formal expression of the constraints that areoperative. For example, it has been demonstrated that gait selection in horse lo-comotion makes sense when considered in light of energetic requirements [13]. Foreach of the three gaits studied, walking, trotting, and galloping, the metaboliccost varied with rate. This allowed identification of rates for each gait at whichenergetic costs are minimized. Horses observed in the paddock adopting thesegaits spontaneously did so at rates that are, in fact, approximately optimal. Lo-comotion is a form of action that has been shaped both phylogenetically overmany millennia, and ontogenetically through constant practice. It seems highlyplausible that the resulting action is constrained to be optimal with respect tomany potential criteria. Analysis of bone strain as a function of speed leadsto similar conclusions as the analysis of metabolic cost as indexed by oxygenconsumption [13, 2].

1 One way of describing the difference between the modeling approaches is availablein the Aristotelian distinction between efficient cause, which comfortably accommo-dates the notion of a controller, and formal cause, which describes lawful domainsof organization without the need to commit to any such central executive.

Page 8: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

8 Fred Cummins

We turn now to another characterization of speech coordination within amodel, but framed at quite a different level. We demonstrate that optimalitycriteria may provide some explanatory power in interpreting the form of observedmovement, and may help to bridge the gap between accounts couched in terms ofcontrol or in terms of coordination by reducing the need for explicit interventionby a controller.

4 Embodied Task Dynamics

When we speak, the set of goals we embody can be described in many ways:transmission of a sequence of words, effecting a response in the other, makinga particular kind of sound, etc. One particularly informative way of describingspeech production is as a sequence of articulatory gestures produced in parallelstreams, where the gestures correspond to primitive units of combination withina phonology. This is the basic premise of Articulatory Phonology [3], and hasprovided a powerful explanatory framework for understanding both categoricaland gradient features found in articulation [5]. Fig. 3 shows a partial gesturalscore (by analogy with a musical score) for the utterance /pan/. Each row (or‘tier’) is associated with a (vocal) tract variable. These are linguistically signif-icant dimensions of variation. Note that the times of individual gesture onsetsand offsets do not necessarily align across tiers, as gestures are not simply co-extensive with phonemes. Thus velum lowering precedes tongue tip movementfor the /n/, as is well known from phonetic data. Each tract variable is repre-sented as a simple mass-spring dynamical system for which a target equilibriumposition is provided during periods in which the gesture is active, as specified inthe score. During periods of activation, tract variables move smoothly towardstheir targets, and when the associated gesture is no longer active, they relaxto a neutral position. It is then necessary to map from the space of tract vari-ables (which are all independent of one another and thus context-free) to thespace of articulators, where multiple tract variables may compete for influenceover a specific articulator. For example, the jaw is crucially involved in three ofthe four tract variable movements shown, and for a period, there are conflictinginfluences on the jaw, pushing it lower for the /a/ target, and raising it for /n/.

The tricky business of mapping from tract variables to articulators within anarticulatory synthesis system is provided by the task dynamic model, originallyintroduced in [21] to account for limb movement, and later extended to thespeech domain in [22]. Task dynamics uses techniques from linear algebra touncover the optimal mapping from tract variable motions (arising from the mass-spring dynamics) to model articulator motions. In this way, the nice, smoothmotion resulting from relatively simple dynamical systems (tract variables) canbe manifested in the more complex space of a model vocal tract. Together,articulatory phonology and its task dynamic implementation have been verysuccessful at accounting for a wide range of linguistic phenomena, and at linkingobserved movement to underlying discrete behavioral goals through a principledand explicit mapping [4, 5].

Page 9: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

Coordination, not Control, is Central to Movement 9

Fig. 3. Partial gestural score for the utterance /pan/. Blocks represent activation in-tervals. Curves represent tract variable movement.

A gestural score contains information about both sequential order, and pre-cise timing. It can thus be interpreted as a sophisticated kind of control algo-rithm, with task dynamics providing the means by which its control variables aremade to do real work. Getting the timing right within the gestural score is thusa difficult problem, and one of fundamental importance. Past approaches havesought to learn appropriate timing from articulatory data using neural networks[22], using an extra layer of planning oscillators [18], and by introducing a degreeof flexibility in relative timing through so-called phase windows [20].

In a recent development of the task dynamic model, Simko and Cummins[25] sought to divide the control task represented in the gestural score into twodistinct parts. The specification of serial order among gestures is specified inde-pendently of the timing among gestures. A modified form of task dynamics thenmakes it possible to express constraints under which the speech is produced, andthese constraints, in turn, allow us to distinguish forms of movement that aremore or less efficient, and thus to find an optimal gestural score that satisfies theovert goal (the gesture sequence) and that reflects the constraints under whichspeech is produced. Fig. 4 shows a simple gestural score before and after opti-mization. The top panel (before) specifies only the linear order of the sequence/abi/, while precise timings have emerged after a process of automatic optimiza-tion in the lower panel. In the radically simplified vocal tract model employedso far, we model only the consonants /p/ and /b/2 and the vowels /a/ and /i/.

Optimization is based upon a cost function with three weighted components:C = αE + αP + αD. Collectively, these express the high-level constraints underwhich speech is produced. The first two components, αE and αP serve to es-tablish a trade off between ease of articulation and communicative effectiveness.αE is a quantification of the effort expended in executing a series of movements,while αP is a cost that rises if articulation is sloppy, or the speech is hard to parseby a listener. Together, the relative magnitude of these weights serves to locate2 Without a glottal model, there is no meaningful distinction within the model between

these stops and their voiced counterparts, /b/ and /d/.

Page 10: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

10 Fred Cummins

Fig. 4. Gestural score before and after optimization.

a given production as lying on a specific point on a scale from hypo-articulationto hyper-articulation [17]. The third component, αD places a relative cost onexecuting an utterance quickly. Full details of the model are presented in [24, 26]and a summary overview can be found in [25].

/i//a/

/b/

−150 −100 −50 0 50 100 150 200 250−15

−10

−5

0

5

10

time (ms)

Fig. 5. Sample movement traces after optimization. Heavy solid line: tongue body;Light solid line: jaw; Dashed lines: lips. Vertical lines demarcate the interval of conso-nant closure.

Fig. 5 illustrates the output of the model. Movement traces are shown forjaw (light solid line), tongue body (heavy solid) and lips (dashed). The verticallines demarcate the interval of consonant closure. Movement of the lips withinthat interval arises from soft tissue compression. This score, and the associatedmovement traces, is derived fully automatically from the sequence specification/abi/ and specific choices of weights for the three elements in the cost function,

Page 11: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

Coordination, not Control, is Central to Movement 11

C. The resulting form of movement is fluent, and to date, has matched publisheddetails of articulatory movement with a high degree of accuracy [26]. Furtherdevelopment, including extension to a full vocal tract model, will be guided byprecise articulatory data.

The gestural score of the original articulatory phonology/task dynamics modelis a form of control algorithm that specifies both sequence and precise timing.In the more recent embodied model, these jobs are separated. The input to themodel (the residual control element) comprises just the behavioral goal expressedas a sequence of gestures, without explicit timing information. To these is addedthe specification of constraint under which speech is produced (the cost functionweights), and this collectively allows computation of the optimal form of move-ment. This is no longer a control algorithm, but rather an account of the formof observed movement couched in terms of discrete behavioral goals along withhigh-level intentional constraints (weights).

5 Dynamics and Autonomy

In each of the two examples presented herein, an attempt is made to under-stand the form of observed behavior, while remaining somewhat agnostic as tothe underlying (efficient) causes. We observe tight synchrony among speakersand wish to find the best characterization thereof that can account for reducedvariability, sustained synchrony without leaders, and the fragility of the cooper-ative behavior when errors creep in. The conceptual tools of dynamical systems,and in particularly their application to understanding coordination, provide anappropriate vocabulary and instrumentarium for capturing this regularity. Theydo so by taking the identification of the system to be modeled as a critical partof scientific inquiry. Rather than assuming that the sole domain of autonomyin behavior is the individual person, the dynamical approach here posits thetemporary capture of two subjects within a superordinate dyadic domain.

In the embodied task dynamic model, the set of entities that exhibit mutualcoordination changes over time. The boundaries of the set are determined bythe gestural score, which in turn arises from a minimal set of discrete behavioralgoals, and some context-specific constraints. Collectively, these serve to identifyan optimal form of movement, where ‘optimality’ has a precise and quantifiablemeaning. Again, the domain within which lawfulness and constraint operate isnot give a priori, but is rather dependent on time-varying behavioral goals andthe context within which behavior occurs.

The brief accounts provided here to not do justice to either experiment ormodel. For those details, the reader is encouraged to seek out the primary pub-lications referenced herein. The goal here has been to show how the adoption ofa dynamical perspective can help in understanding the structure and lawfulnessof behavior, but that this approach also demands an openness with respect tothe system being modeled. Any dynamic modeling must first be explicit aboutthe set of state variables to be considered. This initial choice positively encour-ages the creation of models that cut across the somewhat arbitrary boundaries

Page 12: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

12 Fred Cummins

that separate brain, body, and environment, and gives the modeler pause forthought about the domain being studied: is it a person, a well-defined subset ofthe person, a person plus a tool, a set of persons? The right answer will, in allcases, be an empirical issue, and will depend on where lawfulness is observed.This flexibility to identify lawfulness at many different levels may ultimatelyencourage us to identify time-varying domains within which components exhibitinterdependent behavior that is both constitutive of, and constrained by, thesystem level organization.

A final word is appropriate to fend off an inevitable potential point of con-fusion. The notion of autonomy within dynamical modeling is clearly separatefrom the concept of agency. The autonomous domain is precisely that set ofvariables that exhibits a lawful set of relations among its components. To iden-tify, e.g. a dyad, as an autonomous domain is to make an observation aboutthe structure and form of their collective behavior. It does not say anything atall about the more mysterious notion of agency. Even though individuals mayact as relatively simple components within an overall pattern, this does not robthem of their intrinsic autonomy. This is clear when we realize that participantsin a Mexican wave are acting as simple components in a collective organization,without any sacrifice of their ability to later leave the game and go buy a hotdog.

References

1. N. A. Bernstein. A new method of mirror cylographie and its application towardsthe study of labor movements during work on a workbench. Hygiene, Safety andPathology lf Labor, 5:3–9 and 6:3–11, 1930. In Russian. Cited in Latash (2008).

2. A. A. Biewener and C. R. Taylor. Bone strain: a determinant of gait and speed?The Journal of Experimental Biology, 123:383–400, July 1986.

3. Catherine P. Browman and Louis Goldstein. Articulatory phonology: An overview.Phonetica, 49:155–180, 1992.

4. Catherine P. Browman and Louis Goldstein. Dynamics and articulatory phonology.In R. F. Port and T. van Gelder, editors, Mind as Motion, chapter 7, pages 175–193.MIT Press, Cambridge, MA, 1995.

5. C.P. Browman and L. Goldstein. Articulatory gestures as phonological units.Phonology, 6(02):201–251, 2008.

6. P. Calvo and A. Gomila. Handbook of Cognitive Science: An Embodied Approach.Elsevier Science Ltd, 2008.

7. A. Chemero and M. Silberstein. After the Philosophy of Mind: Replacing Scholas-ticism with Science. Philosophy of Science, 75:1–27, 2008.

8. F. Cummins. On synchronous speech. Acoustics Research Letters Online, 3(1):7–11, 2002.

9. F. Cummins. Practice and performance in speech produced synchronously. Journalof Phonetics, 31(2):139–148, April 2003.

10. Fred Cummins. Synchronization among speakers reduces macroscopic temporalvariability. In Proceedings of the 26th Annual Meeting of the Cognitive ScienceSociety, pages 304–309, 2004.

11. Fred Cummins. Rhythm as entrainment: The case of synchronous speech. Journalof Phonetics, 37(1):16–28, January 2009.

Page 13: Coordination, not Control, is Central to Movementcspeech.ucd.ie/Fred/docs/cumminsCaserta2010.pdf · 2014-04-17 · Coordination, not Control, is Central to Movement 5 by the coordinated

Coordination, not Control, is Central to Movement 13

12. H. Haken, J. A. S. Kelso, and H. Bunz. A theoretical model of phase transitionsin human hand movement. Biological Cybernetics, 51:347–356, 1985.

13. D.F. Hoyt and C.R. Taylor. Gait and the energetics of locomotion in horses.Nature, 292(5820):239–240, 1981.

14. J. A. Scott Kelso. Dynamic Patterns. MIT Press, Cambridge, MA, 1995.15. J.S. Kelso, B. Tuller, E. Vatikiotis-Bateson, and C.A. Fowler. Functionally specific

articulatory cooperation following jaw perturbations during speech: Evidence forcoordinative structures. Journal of Experimental Psychology: Human Perceptionand Performance, 10(6):812–832, 1984.

16. M.L. Latash. Synergy. Oxford University Press, USA, 2008.17. Bjorn Lindblom. Explaining phonetic variation: a sketch of the H&H theory. In

William J. Hardcastle and A. Marchal, editors, Speech Production and Speech Mod-elling, pages 403–439. Kluwer Academic, Dordrecht, 1990.

18. H. Nam and E. Saltzman. A competitive, coupled oscillator model of syllablestructure. In International Conference on Phonetic Sciences, 2003.

19. Robert Port and Timothy van Gelder, editors. Mind as Motion: Explorations inthe Dynamics of Cognition. Bradford Books/MIT Press, Cambridge, MA, 1995.

20. E. Saltzman and D. Byrd. Task-dynamics of gestural timing: Phase windows andmultifrequency rhythms. Human Movement Science, 19(4):499–526, 2000.

21. Elliot Saltzman and J. A. S. Kelso. Skilled actions: A task dynamic approach.Psychological Review, 94:84–106, 1987.

22. Elliot Saltzman and Kevin Munhall. A dynamical approach to gestural patterningin speech production. Ecological Psychology, 1:333–382, 1989.

23. R. C. Schmidt, C. Carello, and M. T. Turvey. Phase transitions and critical fluctu-ations in the visual coordination of rhythmic movements between people. Journalof Experimental Psychology. Human Perception and Performance, 16(2):227–247,May 1990.

24. Juraj Simko and Fred Cummins. Embodied task dynamics. Psychological Review,2009. In press.

25. Juraj Simko and Fred Cummins. Sequencing of articulatory gestures using costoptimization. In Proceedings of INTERSPEECH 2009, Brighton, U.K., 2009.

26. Juraj Simko and Fred Cummins. Sequencing and optimization within an embodiedtask dynamic model. Cognitive Science, 2010. In revision.

27. B. Tuller, MT Turvey, and H.L. Fitch. The Bernstein perspective: II. The conceptof muscle linkage or coordinative structure. Human Motor Behavior: An Introduc-tion, pages 253–270, 1982.

28. B. Wallace, A. Ross, J. Davies, and T. Anderson. The Mind, the Body and theWorld. Psychology after Cognitivism. Imprint Academic, Charlottesville, VA, 2007.