
Synchronization in an Asynchronous Agent-based Architecture for Dialogue Systems

    Nate Blaylock and James Allen and George Ferguson

    Department of Computer Science

    University of Rochester

Rochester, New York 14627 USA

    {blaylock,james,ferguson}@cs.rochester.edu

    Abstract

Most dialogue architectures are either pipelined or, if agent-based, are restricted to a pipelined flow-of-information. The TRIPS dialogue architecture is agent-based and asynchronous, with several layers of information flow. We present this architecture and the synchronization issues we encountered in building a truly distributed, agent-based dialogue architecture.

    1 Introduction

More and more people are building dialogue systems. Architecturally, these systems tend to fall into two camps: those with pipelined architectures (e.g., (Lamel et al., 1998; Nakano et al., 1999)), and those with agent-based architectures (e.g., (Seneff et al., 1999; Stent et al., 1999; Rudnicky et al., 1999)). Agent-based architectures are advantageous because they free up system components to potentially act in a more asynchronous manner. However, in practice, most dialogue systems built on an agent-based architecture pass messages such that they are basically functioning in terms of a pipelined flow-of-information.

Our original implementation of the TRIPS spoken dialogue system (Ferguson and Allen, 1998) was such an agent-based, pipelined flow-of-information system. Recently, however, we made changes to the system (Allen et al., 2001a) which allow it to take advantage of the distributed nature of an agent-based system. Instead of system components passing information in a pipelined manner (interpretation → discourse management → generation), we allow the subsystems of interpretation, behavior (reasoning and acting), and generation to work asynchronously. This makes the TRIPS system truly distributed and agent-based.

The driving forces behind these changes are to provide a framework for incremental and asynchronous language processing, and to allow for a mixed-initiative system at the task level. We describe these motivations briefly here.

Incremental Language Processing In a pipelined (or pipelined flow-of-information) system, generation does not occur until after both the interpretation and reasoning processes have completed. This constraint is not present in human-human dialogue, as evidenced by the presence of grounding, utterance acknowledgment, and interruptions. Making interpretation, behavior, and generation asynchronous allows, for example, the system to acknowledge a question while it is still working on finding the answer.

Mixed-initiative Interaction Although pipelined systems allow the system to take discourse-level initiative (cf. (Chu-Carroll and Brown, 1997)), it is difficult to see how they could allow the system to take task-level initiative in a principled way. In most systems, reasoning and action are driven mostly by interpreted input (i.e., they are reactive to the user's utterances). In a mixed-initiative system, the system's response should be determined not only by user input, but also by system goals and obligations, as well as exogenous events. For example, a system with an asynchronous behavior subsystem can inform the user of a new, important event, regardless of whether it is tied to the user's last utterance. On the other hand, in the extreme version of pipelined flow-of-control, behavior cannot do anything until the user says something, which is the only way to get the pipeline flowing.

The reasons for our changes are described in further detail in (Allen et al., 2001a). In this paper, we focus on the issues we encountered in developing an asynchronous agent-based dialogue system and their respective solutions, which turn out to be highly related to the process of grounding.

We first describe the general TRIPS architecture and information flow and then discuss the various points of synchronization within the system. We then discuss what these issues mean in general for the implementation of an asynchronous agent-based system.

    2 TRIPS Architecture

As mentioned above, the TRIPS system[1] (Allen et al., 2000; Allen et al., 2001a; Allen et al., 2001b) is built on an agent-based architecture. Unlike many systems, however, the flow of information within TRIPS is not pipelined. The architecture and information flow between components is shown in Figure 1. In TRIPS, information flows between the three general areas of interpretation, behavior, and generation.

Each TRIPS component is implemented as a separate process. Information is shared by passing KQML (Finin et al., 1997) messages through a central hub, the Facilitator, which supports message logging and syntax checking as well as broadcast and selective broadcast between components.
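The paper does not specify the message-level API; the following minimal Python sketch illustrates hub-and-spoke message passing with selective broadcast in the spirit of the Facilitator. All names (Facilitator, Component, register, send) are our own illustration, not the TRIPS API, and the KQML message is modeled as a dict rather than an s-expression.

```python
# Minimal sketch of hub-and-spoke message passing with selective broadcast.
# Components register for the message types they care about; the hub logs
# each message and forwards it only to registered components.
from collections import defaultdict

class Facilitator:
    def __init__(self):
        # message type -> components registered to receive it
        self.subscribers = defaultdict(list)

    def register(self, component, msg_type):
        """A component asks to receive all messages of a given type."""
        self.subscribers[msg_type].append(component)

    def send(self, message):
        """Log the message, then selectively broadcast it to subscribers."""
        print("LOG:", message)  # the real Facilitator also checks syntax
        for component in self.subscribers[message["type"]]:
            component.receive(message)

class Component:
    def __init__(self, name):
        self.name = name

    def receive(self, message):
        print(f"{self.name} received: {message['content']}")

hub = Facilitator()
tm = Component("TM")
hub.register(tm, "system-understood")
hub.send({"type": "system-understood", "sender": "IM",
          "content": "(initiate (adopt objective ...))"})
```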

We first discuss the individual system components and their functions. We then describe the flow of information through the system and illustrate it with an example.

[1] Further details of the TRIPS dialogue system can be found at our website: http://www.cs.rochester.edu/research/cisd/

[Figure 1: The TRIPS Architecture (Allen et al., 2001a). The figure shows Speech and Parser feeding the Interpretation Manager; Reference, Discourse Context, and the Task Manager straddling the Interpretation, Behavior, and Generation areas; the Behavioral Agent connected to task- and domain-specific knowledge sources (Planner, Scheduler, Monitors, Events) and exogenous event sources; and the Generation Manager driving the Response Planner and Graphics/Speech output. Labeled message paths include task interpretation requests, problem-solving acts recognized from the user, problem-solving acts to perform, and task execution requests.]

    2.1 System Components

Figure 1 shows the various components in the TRIPS system. Components are divided among three main categories: Interpretation, Behavior, and Generation. As shown in the figure, some components straddle categories, meaning they represent state and provide services necessary for both sorts of processing. The Interpretation Manager (IM) interprets user input coming from the various modality processors as it arises. It interacts with Reference to resolve referring expressions and with the Task Manager (TM) to perform plan and intention recognition as part of the interpretation process. It broadcasts recognized speech acts and their interpretation as collaborative problem solving actions (see below), and incrementally updates the Discourse Context (DC). The Behavioral Agent (BA) is in some sense the autonomous heart of the agent. It plans system behavior based on its own goals and obligations, the user's utterances and actions, and changes in the world state. Actions that require task- and domain-dependent processing are performed by the Task Manager. Actions that involve communication and collaboration with the user are sent to the Generation Manager (GM) in the form of communicative acts. The GM coordinates planning the specific content of utterances and display updates and producing the results. Its behavior is driven by discourse obligations (from the DC) and the directives it receives from the BA.

2.1.1 Collaborative Problem Solving Model

The three main components (IM, BA, GM) communicate using messages based on a collaborative problem solving model of dialogue (Allen et al., 2002; Blaylock, 2002). We model dialogue as collaboration between agents which are planning and acting. Together, collaborating agents (i.e., dialogue partners) build and execute plans, deciding on such things as objectives, recipes, resources, situations (facts about the world), and so forth. These are called collaborative problem solving objects, and are operated on by collaborative problem solving acts such as identify (present as a possibility), evaluate, adopt, and others. Thus, together, two agents may decide to adopt a certain objective, or identify a recipe to use for an objective. The agreed-upon beliefs, objectives, recipes, and so forth constitute the collaborative problem solving state.

Of course, because the agents are autonomous, no agent can single-handedly change the collaborative problem solving (CPS) state. Interaction acts are actions that a single agent performs to attempt to change the CPS state. The interaction acts are initiate, continue, complete, and reject. Initiate proposes a new change to the CPS state. Continue adds new information to the proposal, and complete simply accepts the proposal (bringing about the change), without adding additional information. Of course, proposals can be rejected at any time, causing them to fail.

As an example, the utterance "Let's save the heart-attack victim in Pittsford" in an emergency planning domain would be interpreted as two interaction acts: (initiate (identify objective (rescue person1))) and (initiate (adopt objective (rescue person1))). Here the user is proposing that they consider rescuing person1 as a possible objective to pursue. He is also proposing that they adopt it as an objective to plan for.[2]

Interaction acts are recognized (via intention recognition) from speech acts. Interaction acts and speech acts differ in several ways. First, speech acts describe a linguistic level of interaction (ask, tell, etc.), whereas interaction acts deal with a problem solving level (adopting objectives, evaluating recipes, and so forth). Also, as shown above, a single speech act may correspond to many interaction acts.
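To make this two-level representation concrete, here is one way the rescue example above could be encoded as data. The type names (CPSAct, InteractionAct) are our own illustration, not the authors' implementation.

```python
# Illustrative encoding of the two interaction acts derived above from
# "Let's save the heart-attack victim in Pittsford". Each interaction act
# wraps a collaborative problem solving act, which in turn operates on a
# CPS object.
from dataclasses import dataclass

@dataclass
class CPSAct:
    act: str       # e.g. "identify", "evaluate", "adopt"
    obj_type: str  # e.g. "objective", "recipe", "resource", "situation"
    content: str   # domain content, e.g. "(rescue person1)"

@dataclass
class InteractionAct:
    kind: str      # "initiate", "continue", "complete", or "reject"
    cps_act: CPSAct

utterance_interpretation = [
    InteractionAct("initiate", CPSAct("identify", "objective", "(rescue person1)")),
    InteractionAct("initiate", CPSAct("adopt", "objective", "(rescue person1)")),
]
```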

    2.2 Information Flow in the System

There are several paths along which information asynchronously flows through the system. We discuss information flow at the levels of problem solving, discourse, and grounding. The section that follows then gives an example of how this proceeds.

    2.2.1 Problem Solving Level

The problem solving level describes the actual underlying task or purposes of the dialogue and is based on interaction acts. We first describe the problem solving information flow when the user makes an utterance. We then discuss the case where the system takes initiative and how this results in an utterance by the system.

User Utterance Following the diagram in Figure 1, when a user makes an utterance, it goes through the Speech Recognizer to the Parser, which then outputs a list of speech acts (which cover the input) to the Interpretation Manager (IM). The IM then sends the speech acts to Reference for resolution.

[2] Here two interaction acts are posited because of the ability of the system to react to each separately, for example completing the first but rejecting the second. Consider the possible response "No, not right now." (accept this as a possible objective, but reject adopting it right now), versus "The 911 center in Pittsford is handling that, we don't have to worry about it." (reject this as even a possible objective and reject adopting it). The scope of this paper precludes us from giving more detail about multiple interaction acts.


The IM then sends these speech act hypotheses to the Task Manager (TM), which computes the corresponding interaction acts for each, as well as a confidence score that each hypothesis is the correct interpretation.

Based on this, the IM then chooses the best interpretation and broadcasts[3] the chosen CPS act(s) in a "system understood" message. The TM receives this message and updates to the new collaborative problem solving state which this interpretation entails. The Behavioral Agent (BA) receives the broadcast and decides if it wants to form any intentions for action based on the interaction act.
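A rough sketch of this scoring and choose-and-broadcast step follows; tm.score() and hub.send() are hypothetical stand-ins for the interpret request to the TM and the broadcast through the Facilitator, not actual TRIPS calls.

```python
# Sketch of the IM's interpretation step: score each speech act hypothesis
# via the TM, pick the best, and broadcast the chosen interpretation.
def interpret_utterance(speech_act_hypotheses, tm, hub):
    scored = []
    for speech_act in speech_act_hypotheses:
        # TM does intention recognition: speech act -> interaction acts, score
        interaction_acts, confidence = tm.score(speech_act)
        scored.append((confidence, speech_act, interaction_acts))
    # Choose the highest-scoring hypothesis as the official interpretation.
    confidence, chosen, acts = max(scored, key=lambda entry: entry[0])
    # Announce it; the TM and BA both react to this broadcast.
    hub.send({"type": "system-understood", "sender": "IM", "content": acts})
    return chosen, acts
```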

Assuming the BA decides to act on the user's utterance, it sends execution and reasoning requests to the TM, which passes them on to the appropriate back-end components and returns the result to the BA.

The BA then forms an interaction act based on this result and sends it to the GM to be communicated to the user. The GM then generates text and/or graphical updates based on the interaction act and utters/presents them to the user.

In most pipelined and pipelined flow-of-information systems, the only flow of information is at this problem solving level. In TRIPS, however, there are other paths of information flow.

System Initiative TRIPS is also capable of taking initiative. As we stated above, this initiative originates in the BA and can come from one of three areas: user utterances, private system objectives, or exogenous events.

If the system, say because of an exogenous event, decides to take initiative and communicate with the user, it sends an interaction act to the GM. The GM then, following the same path as above, outputs content to the user.

[3] This is a selective broadcast to the components which have registered for such messages.

    2.2.2 Discourse Level

The discourse level[4] describes information which is not directly related to the task at hand, but rather is linguistic in nature. This information is represented as salience information (for Reference) and discourse obligations (Traum and Allen, 1994).

When the user makes an utterance, the input passes (as detailed above) through the Speech Recognizer, to the Parser, and then to the IM, which calls Reference to do resolution. Based on this reference-resolved form, the IM computes any discourse obligations which the utterance entails (e.g., if the utterance was a question, to address or answer it, and also to acknowledge that it heard the question).

At this point, the IM broadcasts a "system heard" message, which includes incurred discourse obligations and changes in salience. Upon receipt of this message, Discourse Context updates its discourse obligations and Reference updates its salience information.

The GM learns of new discourse obligations from the Discourse Context and begins to try to fulfill them, regardless of whether or not it has heard from the BA about the problem solving side of things. There are some obligations it will be unable to fulfill without knowledge of what is happening at the problem solving level (answering or addressing the question, for example). Other obligations, however, can be fulfilled without problem solving knowledge (an acknowledgment, for example), in which case the GM produces content to fulfill the discourse obligation.
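A minimal sketch of this obligation-driven behavior is below; the obligation names and the act_on_obligations helper are illustrative, not the GM's actual representation.

```python
# Fulfill what can be fulfilled without problem solving knowledge (an
# acknowledgment), and hold the rest until the BA's interaction acts arrive.
NEEDS_PS_CONTENT = {"answer-question", "address-question"}

def act_on_obligations(obligations, ps_content_available):
    utterances, deferred = [], []
    for oblig in obligations:
        if oblig in NEEDS_PS_CONTENT and not ps_content_available:
            deferred.append(oblig)         # wait to hear from the BA
        elif oblig == "acknowledge":
            utterances.append("Hang on.")  # fulfillable right away
        else:
            utterances.append(f"<content fulfilling {oblig}>")
    return utterances, deferred

print(act_on_obligations(["acknowledge", "answer-question"], False))
# -> (['Hang on.'], ['answer-question'])
```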

If the GM receives interaction acts and discourse obligations simultaneously, it must produce content which fulfills both problem solving and discourse needs. Usually, these interaction acts and discourse obligations are towards the same objective: an obligation to address or answer a question, and an interaction act of identifying a situation (communicating the answer to the user), for example. However, because the system has the ability to take initiative, these interaction acts and discourse obligations may be disparate: an obligation to address or answer a question and an interaction act to identify and adopt a new pressing objective, for example. In this case, the GM must plan content to fulfill the acts and obligations as best it can: apologizing for not answering the question and then informing the user, for example. Through this method, the GM maintains dialogue coherence even though the BA is autonomous.

[4] Although it works in a conceptually similar way, the current system does not handle discourse level information flow quite so cleanly as is presented here. We intend to clean things up and move to this exact model in the near future.

    2.2.3 Grounding Level

The last level of information flow is at the level that we loosely call grounding (Clark and Schaefer, 1989; Traum, 1994).[5] In TRIPS, acts and obligations are not accomplished and contexts are not updated unless the user has heard and/or understood the system's utterance.[6]

Upon receiving a new utterance, the IM first determines if it contains evidence of the user having heard and understood the system's last utterance.[7] If the user heard and understood, the IM broadcasts a "user heard" message which contains both salience information from the previous system utterance as well as what discourse obligations the system utterance fulfilled. This message can be used by Reference to update salience information and by Discourse Context to discharge fulfilled discourse obligations.

It is important that these contexts not be updated until the system knows that the user heard its last utterance. If the user, for example, walks away as the system speaks, the system's discourse obligations will still not be fulfilled, and salience information will not change.

[5] TRIPS only uses a small subset of Traum's grounding model. In practice, however, this has not presented problems thus far.

[6] The acceptance or rejection of the actual content of an utterance is handled by our collaborative problem solving model (Allen et al., 2002; Blaylock, 2002) and is not further discussed here.

[7] Hearing and understanding are not currently recognized separately in the system. For future work, we would like to extend the system to handle them separately (e.g., the case of the user having heard but not understood).

The GM receives the "user heard" message and also knows which interaction act(s) the system utterance was presenting. It then broadcasts a "user understood" message, which causes the TM to update the collaborative problem solving state, and the BA to release any goals and intentions fulfilled by the interaction act(s).

Again, it is important that these context updates do not occur until the system has evidence that the user understood its last utterance (for reasons similar to those discussed above).

This handling of grounding frees the system from the assumption that the user always hears and understands each utterance.
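A small sketch of such a grounding-gated update is below, assuming an illustrative DiscourseContext class with an on_user_heard handler; this is our own illustration, not the TRIPS code.

```python
# Obligations fulfilled by the system's own utterance are only discharged
# once a "user heard" message confirms the user actually heard it.
class DiscourseContext:
    def __init__(self):
        self.obligations = ["answer-question"]        # currently on the books
        self.pending_discharge = ["answer-question"]  # fulfilled, unconfirmed

    def on_user_heard(self):
        # Only now is the system's utterance known to have been heard.
        for oblig in self.pending_discharge:
            self.obligations.remove(oblig)
        self.pending_discharge = []

dc = DiscourseContext()
# If the user walks away mid-utterance, on_user_heard() never fires and
# the obligation to answer stays on the books, as the text requires.
dc.on_user_heard()
assert dc.obligations == []
```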

    2.3 An Example

We use here an example from our TRIPS Medication Advisor domain (Ferguson et al., 2002). The Medication Advisor is a project carried out in conjunction with the Center for Future Health at the University of Rochester.[8] The system is designed to help people (especially the elderly) understand and manage their prescription medications.

With the huge growth in the number of pharmaceutical therapies, patients tend to end up taking a combination of several different drugs, each of which has its own characteristics and requirements. For example, each drug needs to be taken at a certain rate: once a day, every four hours, as needed, and so on. Some drugs need to be taken on an empty stomach, others with milk, others before or after meals, and so on. Overwhelmed with this large set of complex interactions, many patients simply do not (or cannot) comply with their prescribed drug regimen (Claxton et al., 2001).

The TRIPS Medication Advisor is designed to help alleviate this problem by giving patients easy and accessible prescription information and management in their own home.

[8] http://www.centerforfuturehealth.org

For our example, we assume that a dialogue between the system and user is in progress, and a number of other topics have been addressed. At this point in the conversation, the system has just uttered "Thanks, I'll try that" and now the user utters the following:

    User: Can I take an aspirin?

We trace information flow first at the grounding level, then at the discourse level, and finally at the problem solving level. This information flow is illustrated in Figure 2.

Grounding Level The utterance goes through the Speech Recognizer and Parser to the IM. As illustrated in Figure 2a, based on the utterance, the IM recognizes that the user heard and understood the system's last utterance, so it sends a "user heard" message, which causes the Discourse Context to update discourse obligations and Reference to update salience based on the system's last utterance.

The GM receives the "user heard" message and sends the corresponding "user understood" message, containing the interaction act(s) motivating the system's last utterance. Upon receiving this message, the TM updates the collaborative problem solving state, and the BA updates its intentions and goals.

Meanwhile ... things have been happening at the discourse level.

Discourse Level After the IM sends the "user heard" message, as shown in Figure 2b, it sends Reference a request to resolve references within the user's utterance. It then recognizes that the user has asked a question, which gives the system the discourse obligations of answering (or addressing) the question, as well as acknowledging the question.

The IM then sends a "system heard" message, which causes Reference to update salience and Discourse Context to store the newly-incurred discourse obligations.

The GM receives the new discourse obligations, but has not yet received anything from the BA about problem solving (see below). Without knowledge of what is happening in problem solving, the GM is unable to fulfill the discourse obligation to answer (or address) the question. However, it is able to fulfill the obligation of acknowledging the question, so, after a certain delay of no response from the BA, the GM plans content to produce an acknowledgment, which causes the avatar[9] to graphically show that it is thinking, and also causes the system to utter the following:

    System: Hang on.

Meanwhile ... things have been happening at the problem solving level as well.

Problem Solving Level After it sends the "system heard" message, as shown in Figure 2c, the IM computes possible speech acts for the input. In this case, there are two: a yes-no question about the ability to take aspirin and a request to evaluate the action of taking aspirin.

These are sent to the TM for intention recognition. The first case (the yes-no question) does not seem to fit the task model well and receives a low score. (The system prefers interpretations in which the user wants information for a reason, and not just for the sake of knowing something.) The second speech act is recognized as an initiate of an evaluation of the action of taking aspirin (i.e., the user wants to evaluate this action with the system). This hypothesis receives a higher score.

The IM chooses the second interpretation and broadcasts a "system understood" message that announces this interpretation. The TM receives this message and updates its collaborative problem solving state to reflect that the user did this interaction act. The BA receives the message and, as shown in Figure 2d, decides to adopt the intention of doing the evaluation and reporting it to the user. It sends an evaluation request for the action of the user taking an aspirin to the TM, which queries the back-end components (user knowledge-base and medication knowledge-base) about what prescriptions the user has and if any of them interact with aspirin.

[9] The TRIPS Medication Advisor avatar is a talking capsule whose top half rotates when it is thinking.


[Figure 2: Flow of Information for the Utterance "Can I take an aspirin?": (a) Grounding Level, (b) Discourse Level, (c) and (d) Problem-Solving Level. Each panel shows messages exchanged among the IM, Reference, TM, BA, and GM following the system's utterance "Thanks, I'll try that." (0) and the user's "Can I take an aspirin?" (1): (a) "user heard" and "user understood" messages for utterance 0; (b) a resolve request and reply, a "system heard" message carrying the obligations to acknowledge and to answer, and the acknowledgment "Hang on" (2) addressing the obligation to acknowledge; (c) an interpret request and reply followed by a "system understood" message with the CPS act evaluate-action; (d) the BA's request to perform the problem-solving act, its result, and the answer "No, you are taking ..." (3) informing the user and addressing the obligation to answer.]

The back-end components report that the user has a prescription for Celebrex, and that Celebrex interacts with aspirin. The TM then reports to the BA that the action is a bad idea.

The BA then formulates an interaction act reflecting these facts and sends it to the GM. The GM then produces the following utterance, which performs the interaction act as well as fulfills the discourse obligation of responding to the question.

System: No, you are taking Celebrex and Celebrex interacts with aspirin.

    3 Synchronization

The architecture above is somewhat idealized, in that we have not yet given the details of how the components know which context to interpret messages in and how to ensure that messages get to components in the right order.

We first illustrate these problems by giving a few examples. We then discuss the solution we have implemented.

3.1 Examples of Synchronization Problems

One of the problems that faces most distributed systems is that there is no shared state between the agents. The first problem with the architecture described in Section 2 is the lack of context in which to interpret messages. This is well illustrated by the interpret request from the IM to the TM.

As discussed above, the IM sends its candidate speech acts to the TM, which performs intention recognition and assigns a score. The problem is, in which context should the TM interpret utterances? It cannot simply change its collaborative problem solving state each time it performs intention recognition, since it may get multiple requests from the IM, only one of which gets chosen to be the official interpretation of the system.

We have stated that the TM updates its context each time it receives a "system understood" or "user understood" message. This brings up, however, the second problem of our distributed system. Because all components are operating asynchronously (including the user, we may add), it is impossible to guarantee that messages will arrive at a component in the desired order. This is because "desired order" is a purely pragmatic assessment. Even with a centralized Facilitator through which all messages must pass, the only guarantee is that messages from a particular component to a particular component will arrive in order; i.e., if component A sends component B three messages, they will get there in the order that component A sent them. However, if components A and C each send component B a message, we cannot say which will arrive at component B first.

What this means is that the current context of the IM may be very different from that of the TM. Consider the case where the system has just made an utterance and the user is responding. As we described above, the first thing the IM does is check for hearing and understanding and send off a "user heard" message. The GM, when it receives this message, sends the corresponding "user understood" message, which causes the TM to update to a context containing the system's utterance.

In the meantime, the IM is assuming the context of the system's last utterance as it does interpretation. It then sends off interpret requests to the TM. Now, if the TM receives an interpret request from the IM before it receives the "user understood" message from the GM, it will try to interpret the input in the context of the user's last utterance (as if the user had made two utterances in a row, without the system saying anything in between). This situation will give erroneous results and must be avoided.

    3.2 Synchronization Solution

The solution to these problems is, of course, synchronization: causing components to wait at certain stages to make sure they are in the same context. It is interesting to note that these synchronization points are highly related to a theory of grounding and common ground.

To solve the first problem listed above (lack of context), we have components append context assumptions to the end of each message. Thus, instead of the IM sending the TM a request to interpret B, it sends the TM a request to interpret B in the context of having understood A. Likewise, instead of the IM requesting that Reference resolve D, it requests that Reference resolve D having heard C. Having messages explicitly contain context assumptions allows components to interpret messages in the correct context.

With this model, context now becomes discrete, incrementing with every "chunk" of common ground.[10] These common ground updates correspond exactly to the "heard" and "understood" messages we described above. Thus, in order to perform a certain task (reference resolution, intention recognition, etc.), a component must know in which common ground context it must be done.

[10] For now we treat each utterance as a single "chunk". We are interested, however, in moving to more fine-grained models of dialogue. We believe that our current architecture will still be useful as we move to a finer-grained model.

The solution to the second problem (message ordering) follows from explicitly listing context assumptions. If a component receives a message that is appended with a context about which the component hasn't received an update notice (the "heard" or "understood" message), the component simply defers processing of the message until it has received the corresponding update message and can update its context. This ensures that, although messages may not be guaranteed to arrive in the right order, they will be processed in the right context. This provides the necessary synchronization and allows the asynchronous system components to work together in a coherent manner.
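A compact sketch of this deferral scheme follows, modeling the common ground context as a discrete counter as described above; the class and method names are our own, not the TRIPS implementation.

```python
# Messages carry the common ground context in which they must be
# interpreted; a component that has not yet reached that context queues
# the message instead of processing it.
import heapq

class SynchronizedComponent:
    def __init__(self):
        self.context = 0    # last common ground update applied
        self.deferred = []  # (context, message) heap of early arrivals

    def on_context_update(self, new_context):
        """A 'heard'/'understood' message advances the common ground,
        releasing any messages that were waiting for this context."""
        self.context = new_context
        while self.deferred and self.deferred[0][0] <= self.context:
            _, msg = heapq.heappop(self.deferred)
            self.process(msg)

    def on_message(self, msg_context, msg):
        """E.g. 'interpret B in the context of having understood A'."""
        if msg_context > self.context:
            heapq.heappush(self.deferred, (msg_context, msg))  # defer it
        else:
            self.process(msg)

    def process(self, msg):
        print(f"processing {msg!r} in context {self.context}")

tm = SynchronizedComponent()
tm.on_message(1, "interpret: Can I take an aspirin?")  # early: deferred
tm.on_context_update(1)  # "user understood" arrives; deferred message runs
```

Processing is thus driven by context rather than arrival order, which is exactly the guarantee the Facilitator alone cannot provide.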

    4 Discussion

We believe that, in general, this has several ramifications for any agent-based, non-pipelined flow-of-information architecture.

1. Agents which are queried about more than one hypothesis must keep state for all hypotheses until one is chosen.

2. Agents cannot assume shared context. Because both the system components and user are acting asynchronously, it is impossible in general for any agent to know what context another agent is currently in.

3. Agents must be able to defer working on input. This feature allows them to wait for synchronization if they receive a message to be interpreted in a context they have not yet reached.

Asynchronous agent-based architectures allow dialogue systems to interact with users in a much richer and more natural way. Unfortunately, the cost of moving to a truly distributed system is the need to deal with synchronization. Fortunately, for dialogue systems, models of grounding provide a suitable and intuitive basis for system synchronization.

    5 Conclusion and Future Work

In this paper we presented the TRIPS dialogue system architecture: an asynchronous, agent-based architecture with multiple layers of information flow. We also discussed the problems encountered in building this distributed system. As it turns out, models of grounding provide a foundation for the necessary system synchronization.

For future work, we plan to clean up the model in the ways we have discussed above. We are also interested in moving to a more incremental model of grounding, where grounding can take place and context can change within sentence boundaries. Also, we are interested in extending the model to handle asynchronous issues at the turn-taking level: for example, what happens to context when a user barges in while the system is talking, or if the user and system speak simultaneously for a time? We believe we will be able to leverage our asynchronous model to handle these cases.

    6 Acknowledgments

We would like to thank Amanda Stent, who was involved with the original formulation of this architecture. We also wish to thank the anonymous reviewers for their helpful comments.

This material is based upon work supported by Department of Education (GAANN) grant no. P200A000306; ONR research grant no. N00014-01-1-1015; DARPA research grant no. F30602-98-2-0133; NSF grant no. EIA-0080124; and a grant from the W. M. Keck Foundation.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the above-mentioned organizations.

    References

J. Allen, D. Byron, M. Dzikovska, G. Ferguson, L. Galescu, and A. Stent. 2000. An architecture for a generic dialogue shell. Journal of Natural Language Engineering, special issue on Best Practices in Spoken Language Dialogue Systems Engineering, 6(3):1-16, December.

James Allen, George Ferguson, and Amanda Stent. 2001a. An architecture for more realistic conversational systems. In Proceedings of Intelligent User Interfaces 2001 (IUI-01), pages 1-8, Santa Fe, NM, January.

James F. Allen, Donna K. Byron, Myroslava Dzikovska, George Ferguson, Lucian Galescu, and Amanda Stent. 2001b. Towards conversational human-computer interaction. AI Magazine, 22(4):27-37.

James Allen, Nate Blaylock, and George Ferguson. 2002. A problem solving model for collaborative agents. In First International Joint Conference on Autonomous Agents and Multiagent Systems, Bologna, Italy, July 15-19. To appear.

Nate Blaylock. 2002. Managing communicative intentions in dialogue using a collaborative problem solving model. Technical Report 774, University of Rochester, Department of Computer Science, April.

Jennifer Chu-Carroll and Michael K. Brown. 1997. Initiative in collaborative interactions: its cues and effects. In S. Haller and S. McRoy, editors, Working Notes of AAAI Spring 1997 Symposium on Computational Models of Mixed Initiative Interaction, pages 16-22, Stanford, CA.

Herbert H. Clark and Edward F. Schaefer. 1989. Contributing to discourse. Cognitive Science, 13:259-294.

A. J. Claxton, J. Cramer, and C. Pierce. 2001. A systematic review of the associations between dose regimens and medication compliance. Clinical Therapeutics, 23(8):1296-1310, August.

George Ferguson and James F. Allen. 1998. TRIPS: An integrated intelligent problem-solving assistant. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98), pages 567-573, Madison, WI, July.

George Ferguson, James Allen, Nate Blaylock, Donna Byron, Nate Chambers, Myroslava Dzikovska, Lucian Galescu, Xipeng Shen, Robert Swier, and Mary Swift. 2002. The Medication Advisor project: Preliminary report. Technical Report 776, University of Rochester, Department of Computer Science, May.

Tim Finin, Yannis Labrou, and James Mayfield. 1997. KQML as an agent communication language. In J. M. Bradshaw, editor, Software Agents. AAAI Press, Menlo Park, CA.

L. Lamel, S. Rosset, J. L. Gauvain, S. Bennacef, M. Garnier-Rizet, and B. Prouts. 1998. The LIMSI ARISE system. In Proceedings of the 4th IEEE Workshop on Interactive Voice Technology for Telecommunications Applications, pages 209-214, Torino, Italy, September.

Mikio Nakano, Noboru Miyazaki, Jun-ichi Hirasawa, Kohji Dohsaka, and Takeshi Kawabata. 1999. Understanding unsegmented user utterances in real-time spoken dialogue systems. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL-99), pages 200-207.

A. I. Rudnicky, E. Thayer, P. Constantinides, C. Tchou, R. Shern, K. Lenzo, W. Xu, and A. Oh. 1999. Creating natural dialogs in the Carnegie Mellon Communicator system. In Proceedings of the 6th European Conference on Speech Communication and Technology (Eurospeech-99), pages 1531-1534, Budapest, Hungary, September.

Stephanie Seneff, Raymond Lau, and Joseph Polifroni. 1999. Organization, communication, and control in the Galaxy-II conversational system. In Proceedings of the 6th European Conference on Speech Communication and Technology (Eurospeech-99), Budapest, Hungary, September.

Amanda Stent, John Dowding, Jean Mark Gawron, Elizabeth Owen Bratt, and Robert Moore. 1999. The CommandTalk spoken dialogue system. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL-99).

David R. Traum and James F. Allen. 1994. Discourse obligations in dialogue processing. In Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics (ACL-94), pages 1-8, Las Cruces, New Mexico.

David R. Traum. 1994. A computational theory of grounding in natural language conversation. Technical Report 545, University of Rochester, Department of Computer Science, December. PhD thesis.