
Control Architecture and Experiment of a Situated Robot System for Interactive Assembly

Jianwei Zhang
[email protected]
Faculty of Technology, University of Bielefeld, 33501 Bielefeld, Germany

Alois Knoll
[email protected]
Technical University of Munich, 81667 Munich, Germany

Abstract

We present the development of and experiments with a robot system showing the cognitive capabilities of three- to four-year-old children. We focus on two topics: assembly by two hands and understanding human instructions in natural language as a precondition for assembly systems being perceived by humans as “intelligent”. A typical application of such a system is interactive assembly. A human communicator sharing a view of the assembly scenario with the robot instructs the latter by speaking to it in the same way that he would communicate with a child. His instructions can be under-specified, incomplete and/or context-dependent.

After introducing the general purpose of our project, we present the hardware and software components of our robots necessary for interactive assembly tasks. The control architecture of the robot system with two stationary robot arms is discussed. We then describe the functionalities of the instruction understanding, planning and execution levels. The implementations of a layered-learning methodology, memories and monitoring functions are briefly introduced. Finally, we outline a list of future research topics for extending our system.

1 Introduction

Human beings interact with each other in a multimodal way. Reviewing the history of robotics, the modalities of human-robot interaction can be classified into three levels: the explicit level, the implicit level, and the inter-human-like level. With advances in robot intelligence and in the machine perception of humans, human-robot interaction can become natural and inter-human-like: a user can instruct a robot using natural language (NL), gesture and gaze information in the same way he communicates with a human partner. Technologies leading towards such natural interaction with robots will help extend robotic applications to all human-in-the-loop systems, such as service robots, medical robots, entertainment robots, software robots, etc. In mechatronic applications, the “machine intelligence quotient” (MIQ) can be raised so that untrained persons can easily use such highly functional devices. For building a robot system that understands natural human instructions, a robot control architecture enabling multimodal input, global memory access and fault monitoring becomes the central topic.

2 Some relevant work

One challenge of the research program for robotics is to automate the process of multisensor-supported assembly by gradually enabling the robot and sensor system to carry out the individual steps in an increasingly autonomous fashion. The typical hierarchical RCS architecture for realizing such systems was explained in detail in [1]. However, a fully automatic assembly under diverse uncertain conditions can rarely be realized without any failure. Several projects on communicative agents realized with real robots have been reported, e.g. [8]. In the projects described in [2] and [10], natural language interfaces were used as the “front-end” of an autonomous robot. If constrained natural language is used to realize a limited number of robot operations, special shortcuts can be taken, e.g. recognizing only the nouns in an instruction and listing the possible actions based on a pre-defined knowledge database [11]. In the SAIL project [10], level-based AA-learning combined with attention selection and reinforcement signals was introduced to let a mobile robot learn to navigate and to recognize human faces and simple speech inputs. In [7], the main system architectures were compared, and an object-based approach was proposed to help manage the complexity of intelligent machine development. In the Cog project [3], the sensory and motor systems of a humanoid robot and the implemented active sensing and social behaviors were studied.

To overcome the limitations of these approaches, the concept of the “Artificial Communicator” was developed, which we briefly outline in the sequel.

3 The Communicator Approach

If the nature of assembly tasks cannot be fully predicted, it becomes inevitable to decompose them into more elementary actions. Ideally, the actions specified are atomic in such a way that they always refer to only one step in the assembly of objects or aggregates, i.e. they refer to only one object that is to be assembled with another object or collection thereof (aggregates). The entirety of a system that transforms suitable instructions into such actions is called an artificial communicator (AC). It consists of sensor subsystems, NL processing, cognitive integration and the robotic actors. From the instructor's point of view, the AC should resemble a human communicator (HC) as closely as possible [6]. The AC must be seamlessly integrated into the handling/manipulation process. More importantly, it must be situated, which means that the situational context (i.e. the state of the AC and its environment) of a certain NL (and further modalities) input is always considered for its interpretation. The process of interpretation, in turn, may depend on the history of utterances up to a certain point in the conversation. It may be helpful, for example, to clearly state the goal of the assembly before proceeding with a description of the atomic actions. There are, however, situations in which such a “stepwise refinement” is counter-productive, e.g. if the final goal cannot be easily described. Studies based on observations of children performing assembly tasks have proven useful in developing possible interpretation control flows. From an engineering perspective, the two approaches can be likened to open-loop control (Front-End Approach) and closed-loop control (Incremental Approach), with the human instructor being part of the closed loop.

The research described in the following sections is embedded in a larger interdisciplinary research project aiming at the development of ACs for various purposes, involving scientists from the fields of computational linguistics, cognitive linguistics, computer science and electrical engineering.

4 The Situated Artificial Communicator

There is ample evidence that there exists a strong link between human motor skill and cognitive development (e.g. [5]). Our abilities of emulation, mental modeling and planning of motion are central to human intelligence [4] and, by the way, a precondition for anticipation; but they also critically depend on the experience we gain with our own body dynamics as we plastically adapt our body's shape to the environment. As a basic scenario, the assembly procedure of a toy aircraft (constructed with “Baufix” parts, see Fig. 1) was selected. We have been developing a two-arm robotic system to model and realize human sensorimotor skills for performing assembly tasks and to facilitate human interaction with language and gestures. This robotic system serves as the major test-bed of the ongoing interdisciplinary research program of the project SFB 360 “Situated Artificial Communicators”, a collaborative research unit funded by the Deutsche Forschungsgemeinschaft (DFG), at the University of Bielefeld [13]. A number of parts must be recognized, manipulated and put together to construct the model aircraft. Within the framework of the SFB, a human communicator instructs the robot in each of these steps, which implies that the interaction between them plays an important role in the whole process.

(a) The Baufix construction parts. (b) The goal aggregate.

Figure 1: The assembly of a toy aircraft.

Figure 2: The two-arm multisensor robot system for dialogue-guided assembly.

The physical set-up of this system consists of the following components (Fig. 2):

(i) Two 6-d.o.f. PUMA-260 manipulators are installed overhead in a stationary assembly cell. On each manipulator wrist, a pneumatic jaw gripper with an integrated force/torque sensor and a “self-viewing” hand-eye system (local sensors) is mounted.

(ii) Two cameras with controllable zoom, auto-focus and aperture provide the main vision function. Their tasks are to build 2D/3D world models, to supervise gross motion of the robots, and to track the hand and viewing direction of the human instructor.

(iii) A microphone and loudspeakers are connected to a standard voice recognition system, IBM ViaVoice, to recognize spoken human instructions and to synthesize the generated speech output.


5 Control Architecture

As the backbone of an intelligent system, the control architecture of a complex technical system describes the functionality of the individual modules and the interplay between them. We developed an interactive hierarchical architecture according to Fig. 3. The HC is closely involved in the whole assembly process.

5.1 High-level functions

The system and the HC interact through natural speech and hand gestures. First, an instruction is spoken to the robot system and recognized with the ViaVoice speech engine. In the current system, ViaVoice recognizes only sentences that the grammar we developed allows. In practice, hundreds of grammar rules can be used. If the recognition succeeds, the results are forwarded to the speech recognition/understanding module.

By their very nature, human instructions are situated, ambiguous, and frequently incomplete. In most cases, however, the semantic analysis of such utterances will result in sensible operations. An example is the command “Grasp the left screw”. The system has to identify the operation (grasp), the object of this operation (screw), and the situated specification of the object (left).

With the help of a hand gesture the operator can further disambiguate the object. The system may then use geometric knowledge of the world to identify the right object. Other situated examples are: “Insert in the hole above”, “Screw the bar on the downside in the same way as on the upside”, “Put that there”, “Rotate slightly further to the right”, “Do it again”, etc.

The output of the analysis is then verified to check whether the intended operation can be carried out. If in doubt, the robot agent asks for further specifications, or it has the right to pick an object by itself. Once the proper operation is determined, it is passed to the coordination module on the next level. The final result on this level consists of an Elementary Operation (EO) and the objects to be manipulated, together with the manipulation-relevant information such as type, position/orientation, color and pose (standing, lying, etc.).

An EO is defined in this system as an operation which does not need any further action planning. Typical EOs are: grasp, place, insert into, put on, screw, regrasp and alignment (for an illustration see Fig. 4). The robustness of these operations depends mainly on the quality of the different skills.
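To make the data flow concrete, the following is a minimal sketch in Python of how an analyzed instruction such as “Grasp the left screw” could be mapped onto an EO plus an object descriptor. The names and the keyword lexicon are illustrative assumptions; the actual system resolves references through its own grammar and the vision-based world model.

```python
from dataclasses import dataclass
from enum import Enum, auto

class EO(Enum):
    """Elementary operations: need no further action planning."""
    GRASP = auto(); PLACE = auto(); INSERT_INTO = auto()
    PUT_ON = auto(); SCREW = auto(); REGRASP = auto(); ALIGN = auto()

@dataclass
class ObjectDescriptor:
    """Manipulation-relevant object information from the world model."""
    obj_type: str                  # e.g. "screw", "bar"
    position: tuple                # (x, y, z) in world coordinates
    color: str | None = None
    pose: str | None = None        # e.g. "standing", "lying"

# Hypothetical lexicon standing in for the real grammar output.
VERB_TO_EO = {"grasp": EO.GRASP, "insert": EO.INSERT_INTO, "screw": EO.SCREW}

def analyze(utterance: str, world: list[ObjectDescriptor]):
    """Map a recognized utterance to (EO, object); the situated
    specification "left" is resolved against the shared scene."""
    words = utterance.lower().split()
    eo = next(VERB_TO_EO[w] for w in words if w in VERB_TO_EO)
    candidates = [o for o in world if o.obj_type in words]
    if "left" in words:
        candidates.sort(key=lambda o: o.position[0])  # leftmost first
    if not candidates:
        raise LookupError("no matching object; ask the HC to clarify")
    return eo, candidates[0]
```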

5.2 Planning tasks

On the planning level, an assembly task for the toy aircraft, or for sub-aggregates, is decomposed into a sequence of EOs. The final decision about the motion sequence depends on the instructions of the human user as well as the generated plan. The planning module should not only be able to understand the human instructions, but also learn from the human guidance and improve its planning abilities gradually.

(a) Grasp a screw (b) Regrasp
(c) Place an aggregate (d) Put a part in
(e) Screw (f) Alignment

Figure 4: Examples of elementary operations.

The planning module on the scheduling level receives an EO from the instruction understanding. By referencing the action memory, the planning module chooses the corresponding basic-primitive sequence for the operation. This sequence is a script of basic primitives for implementing the given EO. The task here includes planning the necessary trajectories, choosing the right robot(s), and basic exception handling.

Sequences are executed by the sequencer, which activates different skills on the next (execution) level. The planning module also receives event reports generated by the execution level. If an event is a failure detection, the monitoring module is informed. During normal operation, the monitoring module updates the action memory; it also detects failure events. If it is found that the robot can re-do the operation, the planning module will try again. Otherwise, the monitoring module sends a request to the dialogue module to ask the human communicator how to handle the exception, and waits for an instruction. After the execution of each operation, the knowledge base is updated.
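Schematically, this scheduling loop could be rendered as below. The module interfaces (action_memory, sequencer, monitor, dialogue) are hypothetical stand-ins for the system's actual components:

```python
def execute_eo(eo, objects, action_memory, sequencer, monitor,
               dialogue, knowledge_base, max_retries=1):
    """Scheduling level: expand an EO into its basic-primitive script,
    run it, and escalate unrecoverable failures to the HC."""
    script = action_memory.lookup(eo, objects)    # primitive sequence
    for _ in range(1 + max_retries):
        report = sequencer.run(script)            # activates skills
        if report.ok:
            monitor.update_action_memory(eo, script, report)
            knowledge_base.update(report.world_state)
            return True
        monitor.log_failure(report)               # failure event
        if not monitor.redoable(report):          # cannot re-do
            break
    # Out of autonomous options: ask the human communicator.
    answer = dialogue.ask_how_to_handle(report)
    return execute_eo(answer.eo, answer.objects, action_memory,
                      sequencer, monitor, dialogue, knowledge_base)
```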


Figure 3: An architecture of the Situated Artificial Communicators for instruction understanding and execution.

5.3 Execution level

The sequencing module on the scheduling level uses the assembly skills provided by the execution level to perform a sequence. The complexity of the skills can range from opening the hand to collision-free control of the two arms to their meeting point. Advanced skills are composed of one or more basic skills. Generally, three different kinds of skills are classified: (i) Motor skills: open and close gripper; drive joint to; drive arm to; rotate gripper; move arm in approach direction; move camera, etc. (ii) Sensor skills: get joint; get position in world; get force in approach direction; get torques; check if a specific position is reachable; take a camera picture; detect object; detect moving robot; track an object, etc. (iii) Sensorimotor skills: force-guarded motion; vision-guided gross movement to a goal position; visual servoing of the gripper to the optimal grasping position, etc.
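One plausible way to organize this classification in software is a common interface that the sequencer can activate uniformly, with sensorimotor skills built as closed loops over a motor and a sensor skill. This is a sketch under assumed interfaces, not the project's actual code:

```python
from abc import ABC, abstractmethod

class Skill(ABC):
    """Common interface through which the sequencer activates skills."""
    @abstractmethod
    def run(self, **params): ...

class MotorSkill(Skill):
    """Pure actuation, e.g. 'open gripper', 'drive arm to pose'."""
    def __init__(self, command):           # command: callable on a robot
        self.command = command
    def run(self, **params):
        return self.command(**params)

class SensorSkill(Skill):
    """Pure perception, e.g. 'get force in approach direction'."""
    def __init__(self, reading):           # reading: callable on a sensor
        self.reading = reading
    def run(self, **params):
        return self.reading(**params)

class SensorimotorSkill(Skill):
    """Closed-loop combination, e.g. force-guarded motion: keep
    stepping the motor skill while the sensor guard still holds."""
    def __init__(self, motor, sensor, guard):   # guard: reading -> bool
        self.motor, self.sensor, self.guard = motor, sensor, guard
    def run(self, **params):
        while self.guard(self.sensor.run()):
            self.motor.run(**params)
```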

5.4 Layered learning

Learning the interplay of perception, positioning and manipulation, as well as basic cognitive capabilities, is the foundation for the smooth execution of a human instructor's command sequences. If a command refers to an EO, the disambiguation of the instruction based on multimodal input is the key process. The autonomous sensor-based execution of these instructions requires adaptive, multisensor-based skills together with an understanding of a certain set of linguistic labels. If complex instructions are used, however, the robot system must possess capabilities of skill fusion, sequence generation and planning. It is expected to produce the same result for a repeated instruction even if the situation has changed. The layered-learning approach is our scheme for meeting this challenge.


Layered learning is a hierarchical self-improving approach to realize multimodal robot control, in particular adaptive, multisensor-based skills. Under this concept, tasks are decomposed from high to low level; real situated sensor and actuator signals are located on the lowest level. Both self-supervised and reinforcement learning have been applied to the B-spline model [12] to realize most of the sensorimotor skills. Through task-oriented learning, the linguistic terms describing the perceived situations as well as the robot motions are generated. Skills for manipulation and assembly are acquired by learning on this level using a neuro-fuzzy model. Furthermore, the learning results on the lower levels serve as the basis for the higher levels such as EOs, sequences, strategies, planning and further cognitive capabilities.
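As a flavor of what learning on the lowest level looks like, here is a heavily simplified one-dimensional sketch in the spirit of the B-spline model: the perceived state activates overlapping basis functions, the skill output is a weighted sum of learnable control points, and each update touches only the locally active control points. Basis order, learning rule and all names are assumptions for illustration, not the published model of [12].

```python
import numpy as np

class BSplineSkill:
    """1-D skill: output(x) = sum_i N_i(x) * w_i, with N_i linear
    B-spline (hat) basis functions and w_i learnable control points."""
    def __init__(self, lo, hi, n_knots=11, lr=0.1):
        self.knots = np.linspace(lo, hi, n_knots)
        self.w = np.zeros(n_knots)               # control points
        self.lr = lr

    def basis(self, x):
        """Hat functions: 1 at their own knot, 0 at neighboring knots."""
        d = np.abs(x - self.knots) / (self.knots[1] - self.knots[0])
        return np.clip(1.0 - d, 0.0, 1.0)

    def output(self, x):
        return self.basis(x) @ self.w

    def update(self, x, error):
        """Shift only the active control points; 'error' may come from
        a supervised target or a reinforcement signal."""
        self.w += self.lr * error * self.basis(x)

# Toy usage: learn a mapping from a vision offset to an arm correction.
skill = BSplineSkill(lo=-50.0, hi=50.0)
for _ in range(200):
    x = np.random.uniform(-50.0, 50.0)
    target = 0.02 * x                            # unknown 'true' skill
    skill.update(x, target - skill.output(x))
```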

To learn the operation sequences for two arms automatically, we developed a method for learning cooperative tasks. If a single robot is unable to grasp an object in a certain orientation, it can only continue with the help of other robots. The grasping can be realized by a sequence of cooperative operations that re-orient the object. Several sequences are needed to handle the different situations in which an object is not graspable for the robot. It was shown that a distributed learning method based on a Markov decision process is able to learn the sequences for the involved robots: a master robot that needs to grasp and a helping robot that supports it with the re-orientation. A novel state-action graph is used to store the reinforcement values of the learning process.
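Condensed to its core, such sequence learning can be written as tabular Q-learning over a state-action graph, where states are discretized object orientations and actions are cooperative re-orientation operations of the helping robot. The reward scheme and all function names below are illustrative assumptions:

```python
import random
from collections import defaultdict

def learn_reorientation(states, actions, step, graspable,
                        episodes=500, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning on a state-action graph.  step(s, a) yields
    the orientation after a cooperative operation; graspable(s) is
    True once the master robot can grasp the object."""
    q = defaultdict(float)                   # state-action values
    for _ in range(episodes):
        s = random.choice(states)
        while not graspable(s):
            a = (random.choice(actions) if random.random() < eps
                 else max(actions, key=lambda b: q[(s, b)]))
            s2 = step(s, a)
            r = 1.0 if graspable(s2) else -0.01   # penalize extra steps
            best_next = max(q[(s2, b)] for b in actions)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    return q
```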

5.5 Memories

To describe the knowledge base, both semantic and procedural knowledge are used. In our current implementation such knowledge is still hard-coded. It can be viewed as long-term memory to a certain degree, and it will be extended by learning approaches in our future research activities. Short-term memories exist in the perception modules, where they are used for scene recognition, dialogue preparation and action (sensorimotor functions). Learning of another important type of memory, episodic memory, is being studied preliminarily in the assembly scenarios.

According to empirical investigations, episodic memory represents one of the most important components of human intelligence: reminding, mental simulation and planning all use episodic memory as their basis. The diverse high-bandwidth multisensor data of the robot, such as the vision system, joint angles, positions, force profiles, etc., cannot be saved in raw format for arbitrarily long times. Therefore, coding approaches based on appearances and features are suggested [9] for summarizing and generalizing experiences from successfully performed operations. The multisensor trajectories and the motor signals are used for “grounding” the learned operation sequences.

5.6 Monitoring

Monitoring plays an important role in making an intelligent system robust. It is also used frequently by human beings in manipulation and speaking, especially in a new environment or for a new task. Monitoring, and eventually re-planning for repair, result in the non-linearity of the understanding-planning-execution cycle, but they represent an essential function in the cognitive architecture of a robot. Furthermore, it is meaningful to add a diagnosis function which can provide hypotheses about the reasons for diverse failures.

Unexpected events during robot actions can be, for example: a force exceeds a defined threshold; a camera detects no object; singularity; collision; etc. If such an event happens, it is reported to the planning level.
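Schematically, such event detection can be a set of guard predicates evaluated every control cycle, each firing guard being wrapped into an event for the planning level. The state attributes and threshold below are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class RobotEvent:
    kind: str      # e.g. "force_limit", "no_object", "singularity"
    detail: str

FORCE_LIMIT_N = 20.0       # illustrative threshold

def check_events(state) -> list:
    """Evaluate the monitoring guards on the current sensor state and
    return all events that must be reported to the planning level."""
    events = []
    if state.force_magnitude > FORCE_LIMIT_N:
        events.append(RobotEvent("force_limit",
                                 f"{state.force_magnitude:.1f} N"))
    if state.expected_object and not state.detected_object:
        events.append(RobotEvent("no_object", "camera found no object"))
    if state.near_singularity:
        events.append(RobotEvent("singularity", "arm near singular pose"))
    return events
```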

6 Dialogue and Assembly Results

As one example, we studied building the “elevator control” aggregate of the aircraft out of three elementary objects by carrying out dialogues. The objects were laid out on the table, and many more objects than necessary were positioned there in arbitrary order. The HC had a complete image in his mind of what the assembly sequence should be. Alternatively, he could have used the assembly drawings in the construction kit's instructions and translated them into NL.

After the AC has found out whether all objects are present, and after going through an optional object-naming procedure, the first HC input triggers the action planner, which decides which object to grasp and which robot to use. Since the HC did not specify either of these parameters, both are selected according to the principle of economy; in this case, they are chosen so as to minimize robot motion. The motion planner then computes a trajectory, which is passed to the robots. Since there are enough bolts available, the AC issues its standard request for input once the bolt is picked up.
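The principle of economy can be read as a straightforward cost minimization. A toy rendering with invented attribute names, choosing the robot/bolt pair with the least gripper travel:

```python
import math

def pick_by_economy(robots, candidates):
    """Choose the (robot, object) pair minimizing gripper travel,
    a stand-in for the planner's principle of economy."""
    def travel(robot, obj):
        return math.dist(robot.gripper_position, obj.position)
    return min(((r, o) for r in robots for o in candidates),
               key=lambda pair: travel(*pair))
```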

The next HC input results in the other robot picking up the slat. Before this may happen, however, it has to be clarified which slat to take; this involves the incorporation of the gesture recognizer. Then the screwing is triggered, involving the peg-in-hole module mentioned above, followed by the screwing module. For reasons of space, the subsequent steps of the dialogue must be omitted here; they show how error handling and many other operations can be performed, most of which humans are not aware of when they expect machines to do “what I mean”. Fig. 5 shows two typical objects that can be built with the setup as developed up to now.

7 Future Work

Among many topics to be explored, some important ones can be listed as follows:

- The long-term memory is learned from the short-term memory so that symbols, sequences, names and attributes are anchored in the real sensor/actuator world.


Figure 5: Sample aggregates made by our interactive assembly system.

- Methods need to be developed for increasing the capability and quality of reinforcement signals and the fitness evaluation of the learning system. Active sensing and active manipulation can find their applications for these purposes.

- To enable arbitrary transitions between digital measurements and concepts, symbolic sparse coding, granular computing, fuzzy sets and rough sets will be investigated and integrated.

- Action sequences learned on the basis of verbal and visual instructions and summarization need to be built into an appropriate representation so that they can be generalized to analogous or even new tasks.

- Learning on the higher levels should be conducted to select action strategies and to generate intelligent dialogues. This will need the tight integration of more components and more knowledge, as shown in Fig. 3.

- More functions, such as a motivation or creation module, need to be added to the architecture so that the robot can take the initiative instead of passively accepting instructions.

Acknowledgment

This research is supported under grant SFB 360 by the DFG, the German Research Foundation.

References

[1] J. S. Albus. The engineering of mind. In Proceedings of the Fourth International Conference on Simulation of Adaptive Behavior: From Animals to Animats, September 1996.

[2] R. Bischoff and V. Graefe. Integrating vision, touch and natural language in the control of a situation-oriented behavior-based humanoid robot. In IEEE International Conference on Systems, Man, and Cybernetics, Tokyo, 1999.

[3] R. A. Brooks, C. Breazeal, M. Marjanovic, and B. Scassellati. The Cog project: Building a humanoid robot. In C. L. Nehaniv, editor, Computation for Metaphors, Analogy and Agents, volume 1562 of Lecture Notes in Computer Science, pages 52–87. Springer, 1999.

[4] A. Clark and R. Grush. Towards a cognitive robotics. Adaptive Behavior, 7(1):5–16, 1999.

[5] G. Lakoff. Women, Fire, and Dangerous Things: What Categories Reveal About the Mind. University of Chicago Press, 1990.

[6] R. Moratz, H. Eikmeyer, B. Hildebrandt, A. Knoll, F. Kummert, G. Rickheit, and G. Sagerer. Selective visual perception driven by cues from speech processing. In Proc. EPIA 95, Workshop on Appl. of AI to Rob. and Vision Syst., TransTech Publications, 1995.

[7] R. T. Pack, M. Wilkes, G. Biswas, and K. Kawamura. Intelligent machine architecture for object-based system integration. In Proceedings of the IEEE/ASME International Conference on Advanced Intelligent Mechatronics, June 1997.

[8] K. R. Thórisson. Communicative Humanoids – A Computational Model of Psychosocial Dialogue Skills. PhD thesis, MIT Media Lab, 1997.

[9] Y. von Collani, J. Zhang, and A. Knoll. A general learning approach to multisensor-based control using statistical indices. In Proceedings of the 2000 IEEE Conference on Robotics and Automation, San Francisco, California, April 2000.

[10] J. Weng, C. H. Evans, W. S. Hwang, and Y.-B. Lee. The developmental approach to artificial intelligence: Concepts, developmental algorithms and experimental results. In Proc. NSF Design & Manufacturing Grantees Conference, 1999.

[11] T. Yamada, J. Tatsuno, and H. Kobayashi. A practical way to apply the natural human-like communication to human-robot interface. In Proceedings of the 10th IEEE International Workshop on Robot and Human Communication, pages 158–163, Bordeaux-Paris, September 2001.

[12] J. Zhang and A. Knoll. A neuro-fuzzy learning approach to visually guided 3D positioning and pose control of robot arms. In R. Duro, J. Santos, and M. Grana, editors, Biologically Inspired Robot Behavior Engineering. Springer Verlag, 2001.

[13] J. Zhang, Y. von Collani, and A. Knoll. Interactive assembly by a two-arm robot agent. Journal of Robotics and Autonomous Systems, 29:91–100, 1999.
