jamris 2013 vol 7 no 4

pISSN 1897-8649 (PRINT) / eISSN 2080-2145 (ONLINE)

VOLUME 7 N° 4 2013 www.jamris.org

Articles 1

Journal of automation, mobile robotics & intelligent systems

Publisher:Industrial Research Institute for Automation and Measurements PIAP

Editor-in-Chief

Janusz Kacprzyk (Systems Research Institute, Polish Academy of Sciences; PIAP, Poland)

Co-Editors:

Oscar Castillo (Tijuana Institute of Technology, Mexico)

Dimitar Filev (Research & Advanced Engineering, Ford Motor Company, USA)

Kaoru Hirota (Interdisciplinary Graduate School of Science and Engineering,

Tokyo Institute of Technology, Japan)

Witold Pedrycz (ECERF, University of Alberta, Canada)

Roman Szewczyk (PIAP, Warsaw University of Technology, Poland)

Executive Editor:

Anna Ladan [email protected]

Editorial Board:Chairman: Janusz Kacprzyk (Polish Academy of Sciences; PIAP, Poland)

Mariusz Andrzejczak (BUMAR, Poland)

Plamen Angelov (Lancaster University, UK)

Zenn Bien (Korea Advanced Institute of Science and Technology, Korea)

Adam Borkowski (Polish Academy of Sciences, Poland)

Wolfgang Borutzky (Fachhochschule Bonn-Rhein-Sieg, Germany)

Chin Chen Chang (Feng Chia University, Taiwan)

Jorge Manuel Miranda Dias (University of Coimbra, Portugal)

Bogdan Gabrys (Bournemouth University, UK)

Jan Jablkowski (PIAP, Poland)

Stanisław Kaczanowski (PIAP, Poland)

Tadeusz Kaczorek (Warsaw University of Technology, Poland)

Marian P. Kazmierkowski (Warsaw University of Technology, Poland)

Józef Korbicz (University of Zielona Góra, Poland)

Krzysztof Kozłowski (Poznan University of Technology, Poland)

Eckart Kramer (Fachhochschule Eberswalde, Germany)

Piotr Kulczycki (Cracow University of Technology, Poland)

Andrew Kusiak (University of Iowa, USA)

Mark Last (Ben–Gurion University of the Negev, Israel)

Patricia Melin (Tijuana Institute of Technology, Mexico)

Tadeusz Missala (PIAP, Poland)

Fazel Naghdy (University of Wollongong, Australia)

Zbigniew Nahorski (Polish Academy of Science, Poland)

Antoni Niederlinski (Silesian University of Technology, Poland)

Witold Pedrycz (University of Alberta, Canada)

Duc Truong Pham (Cardiff University, UK)

Lech Polkowski (Polish-Japanese Institute of Information Technology, Poland)

Alain Pruski (University of Metz, France)

Leszek Rutkowski (Czestochowa University of Technology, Poland)

Klaus Schilling (Julius-Maximilians-University Würzburg, Germany)

Ryszard Tadeusiewicz (AGH University of Science and Technology in Cracow, Poland)

Stanisław Tarasiewicz (University of Laval, Canada)

Piotr Tatjewski (Warsaw University of Technology, Poland)

Władysław Torbicz (Polish Academy of Sciences, Poland)

Leszek Trybus (Rzeszów University of Technology, Poland)

René Wamkeue (University of Québec, Canada)

Janusz Zalewski (Florida Gulf Coast University, USA)

Marek Zaremba (University of Québec, Canada)

Teresa Zielinska (Warsaw University of Technology, Poland)

Associate Editors:

Jacek Salach (Warsaw University of Technology, Poland)

Maciej Trojnacki (Warsaw University of Technology and PIAP, Poland)

Statistical Editor:

Małgorzata Kaliczynska (PIAP, Poland)

Editorial Office:

Industrial Research Institute for Automation

and Measurements PIAP

Al. Jerozolimskie 202, 02-486 Warsaw, POLAND

Tel. +48-22-8740109, [email protected]

Copyright and reprint permissions

Executive Editor

The reference version of the journal is printed

If in doubt about the proper edition of contributions, please contact the Executive Editor. Articles are reviewed, excluding advertisements

and descriptions of products.

All rights reserved ©

Articles2

EditorialCezary Zieliński, Krzysztof Tchoń

On the Application of Elastic Band Method to Repeatable Inverse Kinematics in Robot ManipulatorsIgnacy Dulęba, Michał Opałka

DOI: 10.1431/JAMRIS_4-2013/5

Impedance Controllers for Electric-Driven RobotsEdward Jezierski, Artur Gmerek

DOI: 10.1431/JAMRIS_4-2013/13

Safe Strategy of Door Opening with Impendance Controlled ManipulatorTomasz Winiarski, Konrad Banachowicz, Maciej Stefańczyk

DOI: 10.1431/JAMRIS_4-2013/21

3D Camera and Lidar Utilization for Mobile Robot NavigationMaciej Stefańczyk, Konrad Banachowicz, Michał Walęcki and Tomasz Winiarski

DOI: 10.1431/JAMRIS_4-2013/28

Optimization-Based Approach for Motion Planning of a Robot Walking on Rough TerrainDominik Belter

DOI: 10.1431/JAMRIS_4-2013/35

Experimental Verification of a Walking Robot Self-Localization System with the Kinect SensorMichał Nowicki, Piotr Skrzypczyński

DOI: 10.1431/JAMRIS_4-2013/43

Extensive Feature Set Approach in Facial Expression Recognition in Static ImagesMateusz Żarkowski

DOI: 10.1431/JAMRIS_4-2013/53

Speech Emotion Recognition System for Social RobotsŁukasz Juszkiewicz

DOI: 10.1431/JAMRIS_4-2013/59

JOURNAL Of AUTOMATION, MOBILE ROBOTICS & INTELLIGENT SYSTEMS

VOLUME 7, N° 4, 2013 DOI: 10.1431/JAMRIS_4-2013

3

13

21

34

27

42

52

59

5

CONTENTS

Journal of Automation, Mobile Robotics & Intelligent Systems VOLUME 7, N° 4 2013

Articles 3

Editorial

Interaction of Robots with the EnvironmentThe intrinsic property of robots is their ability to interact with the environment. The type of the environment in which robots operate can be employed as one of many classification criteria distinguishing them into broad classes, i.e. industrial, service or field robots. This distinction is implied by the degree of structuring that the environment exhibits. Industrial robots operate in very structured environment, like in the factories, service robots populate offices and homes, where the location of furniture or people changes, while field robots operate in forests or fields where the structure is hard to define whatsoever. The structure of the environment in which the robot operates implies how it interacts with its surroundings. This issue of Journal of Automation, Mobile Robotics and Intelligent Systems (JAMRIS) is devoted to the subject of how robots interact with the environment depending on its structure. A closer inspection of this topic should lead to the deduction of commonalities influencing the design of future robot control and perception systems.

The idea of publishing an issue of JAMRIS devoted to interaction of robots with the environment emerged in the discussions during the 12th National Conference on Robotics organized by the Department of Fundamental Cybernetics and Robotics, Institute of Computer Engineering, Control and Robotics, Wrocław University of Technology in Świeradów-Zdrój, from the 12th to the16th of September 2012. This conference showed that the Polish robotics community has produced significant results in this field, and thus the Program Committee decided that it would be beneficial, if those achievements would be put together within one publication, so that they could complement each other. Hence, a selected group of authors working on diverse aspects of robot-environment interaction was invited to submit papers describing their research results. Based on the conference presentations, the papers contained in this issue of JAMRIS are their extended and more comprehensive English versions, subjected to the regular JAMRIS review procedure.

Gratitude should be expressed to all of the reviewers who provided in depth comments enabling many clarifications and overall improvement of the contents of the papers presented in this topical issue of JAMRIS.

This volume of JAMRIS is composed of 8 papers that provide insight into four aspects of robot-environment interaction: • interaction of redundant robots executing repeatable tasks within the factory environment,• physical interaction of manipulators with rigid objects in diverse environments, including home and office

settings,• interaction of mobile robots or walking machines with semi-structured and unstructured environments,• interaction of robot companions with people.

The first paper of this volume is devoted to the interaction of redundant robots with a well structured environment, in which usually repeatable tasks are executed. Ignacy Dulęba and Michał Opałka authored the paper On Application of Elastic Band Method to Repeatable Inverse Kinematics in Robot Manipulators, in which they tackle the problem of generating closed loop trajectories for an end-effector motion of a redundant manipulator, in such a way that the initial and the terminal manipulator configurations will be the same. This problem arises when a redundant manipulator has to execute repeatable motions in the operational and the respective configuration space. For that purpose the elastic band method is used to design a repeatable inverse kinematics algorithm for robot manipulators. The final solution is obtained through optimization techniques. By simulating two robots the authors compare their solution with a standard pseudo-inverse Jacobian algorithm, which does not preserve repeatability of inverse kinematics.

Next two papers are devoted to manipulators physically interacting with rigid objects. This requires force and torque control, which can be obtained through changing the mechanical impedance properties of the robot. The papers present both how such controllers should be designed and the utilization of such controllers in task execution.

Edward Jezierski and Artur Gmerek in the paper entitled Impedance Controllers for Electric-Driven Robots provide an insight into the design of robot actuation control exhibiting compliance. Based on the information


Articles4

about approaching an obstacle, obtained from a proximity sensor, the mechanical impedance of the manipulator is adjusted, thus inducing into the device the required compliance, hence reducing the collision impact. Experimental results obtained by controlling a brushless DC motor and simulation of a 2 dof manipulator illustrate and validate the proposed control algorithms.

The article entitled Safe Strategy of Door Opening with Impedance Controlled Manipulator, by Tomasz Winiarski, Konrad Banachowicz and Maciej Stefańczyk, concentrates on a single activity of service robots, namely their ability to open doors of rooms or cupboards. To that end impedance control law taking into account force and velocity constraints imposed by the manipulated object was employed. Those constraints are due the safety of the performed operation and directly influence the control law. The structure and operation of the designed controller is described formally. The conducted experiments validate the proposed control strategy.

The third group of papers focuses on the interaction of diverse mobile robots with their environments. As their autonomous navigation requires both adequate perception of the environment and motion planning, those are the subjects investigated by those papers.

The results of research conducted by Maciej Stefańczyk, Konrad Banachowicz, Michał Walęcki and Tomasz Winiarski are presented in the paper 3D Camera and Lidar Utilization for Mobile Robot Navigation. The authors deal with a navigation system based on the Kinect sensor and a laser scanner capable of detecting obstacles in indoor environments. The control system behaviour is specified in terms of transition functions. The localization of the robot relative to the map is performed by an Extended Kalman filter taking as its arguments three sources of data: wheel encoders, gyroscope, and the laser scanner. Global robot position relative to the provided approximate map is calculated, using a particle filter. The functioning of the system is verified in a cluttered office environment in which the mobile robot operates.

The paper Optimization-Based Approach for Motion Planning of a Robot Walking on Rough Terrain authored by Dominik Belter, describes an algorithm generating leg-end and body trajectories for a hexapod robot walking on an uneven terrain. For that purpose it uses Rapidly Exploring Random Tree Connect and Particle Swarm Optimization algorithms. Static stability of the machine and motion constraints imposed by obstacles are taken into account. The quality of the elaborated algorithms was tested by simulation.

The work of Michał Nowicki and Piotr Skrzypczyński entitled Experimental Verification of a Walking Robot Self-Localization System with the Kinect Sensor concerns the identification of egomotion of a RGB-D sensor mounted on a hexapod robot. The presented algorithm relies purely on the depth information, i.e. a point cloud. The considerations fall into a broad category related to the design of Simultaneous Localization and Mapping algorithms. A two-stage point cloud matching procedure exploiting salient features in the range data is employed. It first uses feature-based matching procedure utilising Normal Aligned Radial Feature detectors/descriptors followed by Iterative Closest Points algorithm.

The fourth group of papers deals with natural interaction of robots with people who inhabit the environment, thus from the point of view of an autonomous robot are simply a part of it. Acceptable interaction of robots and humans will be possible, only if they can socialize, thus the study of emotions is a prerequisite to this. Hence, the identification of emotions from facial expressions as well as speech tone and prosody is the topic of the next two papers.

Mateusz Żarkowski in the paper entitled Extensive Feature Set Approach in Facial Expression Recognition in Static Images concentrates on facial emotion recognition of a person interacting with a robot. The system recognizes seven human emotions. The system relies on feature extraction and selection, which facilitates the definition of new features and adding them to the feature set. Face tracking and parametrization is performed by FaceTracker software, while feature selection is done by data mining software.

The paper Speech Emotion Recognition System for Social Robots, authored by Łukasz Juszkiewicz, focuses on social robots, e.g. robot companions. Emotions of the person interacting with the robot are recognised using global acoustic features of speech. Four phases of Polish or German speech processing are described, namely: calculation of speech parameters, acoustic feature extraction, feature selection and classification. The paper makes an interesting observation that purely acoustic recognition system has problems with distinguishing anger and joy, thus could result in a grave misjudgement of response by a social robot.

All of the mentioned particular aspects of robot-environment interaction are at the forefront of the currently ongoing research into this subject. Each of the papers gives a valuable insight into a particular problem, providing its formulation and deriving a solution. This selection of papers reveals the wide scope and diversity of contemporary robotics.

Cezary ZielińskiKrzysztof Tchoń

Journal of Automation, Mobile Robotics & Intelligent Systems VOLUME 7, N◦ 4 2013

O A E B M R IK R M

O A E B M R IK R M

O A E B M R IK R M

O A E B M R IK R M

Submi ed: 20th January 2013; accepted: 4th June 2013

Ignacy Duleba, Michal Opalka

DOI: 10.14313/JAMRIS_4-2013/5

Abstract:In this paper an idea of the elas c band method was ex-ploited to design a repeatable inverse kinema cs algo-rithm for robot manipulators. The method incorporatesan op miza on process at many stages of its perfor-mance and admits some extensions. Performance of thealgorithmwas illustrated onmodels of the three DOF pla-nar pendulum and the PUMA robot. A comparison with astandard pseudo-inverse Jacobian algorithm, which doesnot preserve repeatability of inverse kinema cs, is alsoprovided.

Keywords: inverse kinema cs, repeatability, algorithm

1. Introduc onRobot manipulators are frequently used to per-

form repeatable tasks (assembling, drilling, painting)on a factory loor. From a technological point of view,it is desirable to follow exactly the same trajectoryby a manipulator when performing a repeatable task.In this case the problem of avoiding obstacles can besigni icantly reduced as only one trajectory should betested for the collision avoidance. Therefore, one ofthe most popular tasks in robot manipulators is a re-peatable inverse kinematic task. The task is to ind aloop (closed path) in a con iguration space that corre-sponds to a prescribed cycle of the end-effector in ataskspace [1]. This task is meaningful only for redun-dantmanipulators because for non-redundantmanip-ulators a solution of the inverse kinematic task is de-termined uniquely if only a trajectory does not meetany singular con iguration.

A standard approach to construct repeatable in-verse kinematic algorithms relies on extending orig-inal (redundant) kinematics by some extra functions(n − m items) to get extended kinematics which isnon-redundant [10]. Outside singular con igurationsof the original kinematics, the added functions shouldpreserve full rank of the square Jacobian matrix of theextended kinematics (in other words, they should notintroduce extra singular con igurations not present inthe original kinematics).

This constraint is not too restrictive thus there ex-ist many functions to ful ill the aforementioned min-imal requirement. So, naturally, one wants to get anoptimal set of functions with respect to a prescribedquality function [5]. This line was exploited in Karpin-ska’s PhD thesis [4].

In this paper we follow a quite different approachthat does not add any extra functions and avoids a

computationally expensive problem of their optimiza-tion. The proposed approach is based on the elasticband method, developed by Quinlan and Khatib [8],which was primarily designed to optimize a path ofa mobile robot [2]. The idea of this method is the fol-lowing: an initial admissible path is assumed, then it isdeformed iteratively to optimize a given quality func-tion until a stop condition is satis ied. The deforma-tion was originally based on a potential ield methodthat tried to avoid obstacles, bymaximizing a distancefrom the path to obstacles, and make the path shortand smooth. This idea is very close to a more gen-eral approach known in mathematics as a continua-tion method [9]. The continuation method constructa inal solution also in an iterative process. The irstapproximation (maybe not admissible, i.e. not satisfy-ing all requirements imposed on the original task) ofthe solution should be known (and usually it is sim-ple todetermine). Then, a sequence of solutions is con-structedwhichdecrease a certain quality function thatmeasures how far the current solution is from that onewhich satis ies all requirements. In the limit a solutionof the original task is determinedwhich satis ies all re-quirements imposed.

In our adaptation of the elastic band method torepeatable inverse kinematic tasks a role of the de-formed pathwill play a cyclic trajectory in the con igu-ration space. It will be modi ied in such a way that themodi ied trajectory remains cyclic and its image, viaforward kinematics, is closer to the prescribed cyclicpath in the taskspace.

In this paper we use the term trajectory as asynonym for a function in a con iguration spaceparametrizedwith the same variable that parametrizethe prescribed path in the taskspace. Usually the tra-jectory is described as a function of time.

The paper, being the extended version of [3], is or-ganized as follows. In Section 2 the repeatable inversekinematic task is de ined formally. In this section analgorithm based on the elastic band method to solvethe task is discussed with details. Simulation resultsof the proposed algorithm are collected in Section 3.Section 4 concludes the paper.

2. Elas c band method in repeatable inversekinema c taskForward kinematics of a robot manipulator is a

mapping [11]

k : Q ∋ q → x = k(q) ∈ X ⊂ SE(3), (1)

5


where q is a con iguration living in the con igurationspaceQ and x denotes a point in the taskspaceX . Forredundant manipulators dimensionallity of the con-iguration space is larger than dimensionallity of thetaskspace,

dimQ = n > m = dimX. (2)With thekinematics (1) a Jacobianmatrix is associatedand given by J(q) = ∂k/∂q.

Let a loop (a closed path){x(s), s ∈ [0, smax], x(0) = x(smax)} (3)

be given in the taskspace and also an initial con igura-tion q0 is knowncorresponding to the start point of theloop, k(q0) = x0. The task of repeatable inverse kine-matics is to ind a trajectory q(s) in the con igurationspace satisfying∀s∈[0,smax] k(q(s)) = x(s), q(0) = q(smax) = q0. (4)

Additionally, the sum of squares of lengths of the sub-trajectories in the con iguration space

smax∫0

(∂q

∂s

)T (∂q

∂s

)ds =

smax∫0

⟨∂q∂s

,∂q

∂s⟩ds. (5)

shouldbeminimized. In (5)T stands for transposition,and ⟨·, ·⟩ introduces the inner product. We prefer toevaluate a trajectory according to Eq. (5) as it is fasterto compute than to determine the total length of thetrajectory.

The basic algorithm exploited in solving the afore-mentioned task is the Newton algorithm designed tosolve the inverse kinematics when the goal is to reacha single point in the taskspace. The algorithm is de-scribed by the iteration process [7,11] with a possibleoptimization in the null space of the Jacobian

qi+1 = qi + ξ1 · J#(qi) (xf − k(qi))+

+ξ2(I − J#(qi)J(qi)

) ∂f(q)∂q |q=qi ,

(6)

where J# = JT (JJT )−1 is the pseudo-inverse of theJacobian matrix, q0 is the initial con iguration of themanipulator, xf denotes a current goal point in thetaskspace, I is the (n× n) identity matrix and a scalarfunction f(q) is to be optimized (the optimization canbe switched-off by setting the value of ξ2 to 0.

Below main steps of the algorithm based on theelastic band method applied to repeatable inversekinematic task are presentedStep 1 Select an initial loop in the con iguration space

q(s), s ∈ [0, smax], (q(0) = q(smax)).

The loop becomes a current trajectory.Step 2 De ine an error function that assigns to a cyclic

trajectory q(·) accumulated distance of its image(via forward kinematics) to the traced path x(·)in the taskspace

err(q(·)) =smax∫

s=0

||x(s)− k(q(s))||ds, (7)

where || · || is an assumed norm.

Step 3 Compute the error (7) for the current trajec-tory.

Step 4 If the error is below an acceptable threshold,complete the computations andoutput the result-ing current trajectory. Otherwise progress withStep 5.

Step 5 Modify the current trajectory (either at a sin-gle point or on an interval) preserving cyclicity ofthe trajectorywith the aim to decreased the valueof error (7). Themodi ied trajectory becomes thenew current trajectory. Continue with Step 3.

More details concerning the algorithm follow.- A few terms and notations extensively used later onare introduced:- a node point is any point placed on the given pathx(·),

- a node con iguration is a con iguration which im-age via forward kinematics is a node point,

- a node con iguration q when included into a tra-jectory q(s) gets the same argument s as the corre-sponding node point x(s). The argument s of x(s)is determined uniquely,

- (q(s), x(s)) = Newton(q0, x(s), f(q)) denotesthat the Newton algorithm (6)with the initial con-iguration q0, the ixed goal pointx(s) (as s is ixed)and minimizing the quality function f(q) was runand produced as its output the pair node con igu-ration - node point (q(s), x(s)).

- A selection of the initial (passing through q0) cyclictrajectory is unrestricted. However, its selection im-pacts both the speedof convergence aswell as its op-timality. The best natural choice (and simplest oneas well) is a trivial loop, i.e. ∀s ∈ [0, smax] q(s) = q0.

- Consecutive node con igurations are interpolatedlinearly, although higher order polynomials are alsopossible.

- An invariant of the algorithm (characteristics thatdoes not change from one iteration to another) isthe cyclicity of the current trajectory. Initially thereis a trivial loop q(0) → q(smax) = q(0). Afteradding a single node con iguration, say q(smax/2)corresponding to x(smax/2), the closed loops is thefollowing q(0) → q(smax/2) → q(smax) = q(0), etc.

- The norm used in Eq. (7) depends on the nature ofthe taskspace. When all coordinates are positional,the Euclidean norm will be the most proper. Whenthe taskspace includes also angle coordinates somenorm inherited from the special Euclidean groupSE(3) [12] should be used. To simplify computa-tions, the integral (7) can be replaced by a sum oflocal errors for arguments s between currently ex-isting node con igurations.

- The key step of the algorithm, Step 5, can be imple-mented at least in two possible ways:

Scheme 1: on a prescribed path x(·) the furthestnode point x(s⋆) from the x(0) is determined.

6


Then, the Newton algorithm is run

(q(s⋆), x(s⋆)) = Newton(q(0), x(s⋆), f(q)).

where f(q) = ||q − q(0)||. The follow-ing computations are de ined recur-sively. Let us assume that a sequenceS = ((q(si), x(si)), si < si+1) of pairs nodecon iguration-node points has been alreadydetermined. After the irst stage we have S =((q(0), x(0)), (q(s⋆), x(s⋆)), (q(smax), x(smax))).Note that the sequence is cyclic (i.e. the lastpair in the sequence is the same as the irstone; the only difference is that the argument isequal to smax instead of 0). In each (or some)of the sub-intervals [si, si+1] the furthest nodex(s⋆) from the line x(si)x(si+1) is selected.The Newton algorithm is run

(q(s⋆), x(s⋆)) = Newton(qini, x(s⋆), f(q))

initialized at the con iguration

qini = λq(si) + (1− λ)q(si+1), (8)

withλ =

s⋆ − sisi+1 − si

∈ (0, 1). (9)

qini is theweighted sumof the twoneighboringnode con igurations while the optimized func-tion is given as

f(q) = ||q − q(si)||+ ||q − q(si+1)||. (10)

The resulting pair (q(s⋆), x(s⋆)) is added to Sand the S is sorted to get ∀i si < si+1. Thisidea is illustrated graphically in Fig. 1a.

Scheme 2: let a sequence of node points be given(x(s1), . . . , x(sk)) with 0 < si < si+1 < smax.Simultaneously, a set of i ∈ {1, . . . , k} Newtonalgorithms is run Newton(q(0), x(si), f(q))with the quality function, cf. Fig. 2

f(q) = ||q − q(sji )||+ ||q − q(sji+1)||, (11)

where q(sji ) denotes the con iguration ob-tained in the j-iteration of the Newton algo-rithm run for the i-the node point. Obviously,not all tasks i ∈ {1, . . . , k} will be completedafter the same number of iterations. If the i-thtask has been inished after ji iterations, it isassumed that ∀j > ji : q(s

ji ) = q(sjii ). Graphi-

cally this idea is visualized in Fig. 1b.

The two schemes rely on the same idea: generatea new con iguration to approach the required nodepointwhile increasing the length of trajectory as smallas possible. However, the irst scheme realizes it se-quentially while the second one – in parallel.

The procedure of generation new node con igura-tions is repeated until their number is large enoughto considered the resulting trajectory to be the solu-tion of the repeatable inverse kinematic task. To this

a

1

2

2

3

33

3

k(q )0

b

k(q )0

Fig. 1. a – genera on of node points with Scheme 1,(numbers correspond to a phase of adding the nodes),b – genera on of node points with Scheme 2

k

k

k

x(s )

x(s )

q(s ) q(s )

m

r

r

l

x(s )l

q(s )m

q (s )0

m

Fig. 2. Evolu on of configura ons from q0(sm) to q(sm)to reach the point k(q(sm)) = x(sm)

aim let us de ine for a set of node con igurations SQ ={(q(si)), = 1, . . . ,K}, extracted from the sequence Sthe distance d

d(k(SQ), x(·)) =maxi=1,K−1 ||k

(q(si)+q(si+1)

2

)− x

(si+si+1

2

)||.(12)

If the condition

d(k(SQ), x(·)) ≤ δ, (13)

(where δ denotes an assumed acceptable error) is sat-is ied, the loop in the con iguration space has beenfound (with the linear interpolation of consecutivenode con igurations as an output). Otherwise in thosesegments where condition (13) was violated, compu-tation progresses according to a selected scheme.

One more aspect of the algorithm (6) with opti-mization in the null space of the Jacobianmatrix (ξ2 =

7


0) should be considered because this algorithm is runmany times and it greatly impact overall computa-tional complexity.

The simplest choice of coef icients ξ1, ξ2 is to settheir values small and constant. However, probablythis is not the best possible choice. To support thisclaim let us notice that1) computational scheme (6) is valid only for a small

con iguration increase∆qi = ξ1∆q1i + ξ2∆q2i , be-cause Eq. (6) is based on linearization of kinemat-ics around a current con iguration

2) the nature of components∆q1i = J#(qi)(xf − k(qi)) (14)

and

∆q2i =(I − J#(qi)J(qi)

) ∂f∂q

|q=qi (15)

are quite different although both generate val-ues in Rn. They values may strongly depend on astage of computations. For example the irst com-ponent (14) takes smaller and smaller values asa point corresponding to a current con igurationgets closer and closer to the goal point in thetaskspace. The irst component (14) is responsiblefor convergence of the algorithm (6) so it shouldhave got higher priority than the other (15), re-sponsible for optimization of f(q).Therefore in practical implementation of the New-

tonalgorithm(6)wepropose the followingprocedure:let us assume that the length of admissible changeof con iguration in a single iteration is equal to ∆. Ina current con iguration qi, the components ∆q1i ,∆q2iare computed. Then, the values of ξ1, ξ2 are set to get||∆qi|| = ∆. Finally, one dimensional optimizationprocess is invoked with the quality function f(qi +ξ1∆q1i + ξ2∆q2i ) and the independent variable ξ2.ξ1(ξ2) is computed from

||ξ1∆q1i + ξ2∆q2i || = ∆ (16)while the convergence condition||k(qi+ ξ1∆q1i + ξ2∆q2i )−xf || < ||k(qi)−xf ||. (17)must hold. If the condition (17) is not satis ied (i.e.there does not exist admissible ξ2), the value of ∆should be decreased and once again the procedure isrun. Obviously, in a close vicinity of the goal point thevalue of∆ is decreased obligatory,∆ = ||k(qi)− xf ||.

Because forward kinematics k(q) as well as thequality function f(q) are continuous functions of q,the aforementioned optimization process is conver-gent outside singular con igurations (where the ma-trix J# becomes ill-posed), and the convergence is rel-atively fast because in one dimensional optimizationonly computationally cheap kinematics is invoked.

It is worth noticing that in a close vicinity of sin-gular con igurations a robust inverse pseudo-inverseshould be used. Temporarily, J# is replaced there byJT (JJT + γI)−1 with a small parameter γ. Conse-quently, the loop in the con iguration space can begenerated even in this case.

3. Simula onsAn illustration of the elastic band method to re-

peatable inverse kinematic task was performed ontwo robots. The irst model is the 3-dof planar pendu-lum visualized in Fig. 3 with its kinematics given by[

xy

]=

[a1c1 + a2c12 + a3c123a1s1 + a2s12 + a3s123

], (18)

where ai denote lengths ofmanipulator’s links. A stan-dard robotic convention is exploited to abbreviatetrigonometric functions, c12 = cos(q1 + q2), s123 =sin(q1 + q2 + q3). In all simulations all lengths of linksof the planar pendulum were assumed equal to 1.

q

q

q2

3

1

x

y

a

a

1

a2 3

Fig. 3. Kinema c structure of the 3-dof planar pendulum

As the second model, the industrial robot PUMAwas chosen (Fig. 4). PUMA’s positional kinematics(x, y, z) is given by [6] x

yz

=

c1A− s1(d2 + d6s4s5)s1A+ c1(d2 + d6s4s5)

−a2s2 + d4c23 + d6(c5c23 − c4s5s23)

(19)

where A = a2c2 + d4s23 + d6(c4s5c23 + c5s23) andgeometric parameters are equal to a2 = 0.432m, d2 =0.0745m, d4 = 0.432m, d6 = 0.056m.

Fig. 4. Kinema c structure of the PUMA robot

In order to compare the performance of bothschemes two closed-loops in the taskspace weredesigned (cf. Fig. 5). The irst path (Path 1)

8


(x1(s), y1(s), z1(s)) is a circlex1(s) = xc +R cos(2πs),

y1(s) = yc +R sin(2πs),

z1(s) = zc +R cos(2πs), s ∈ [0, 1].

(20)

The second path (Path 2) (x2(s), y2(s), z2(s)) isa Lissajous-like curve

x2(s) = xc +R sin(2πs),

y2(s) = yc +R sin(πs)− 14R cos(8πs)

z2(s) = zc +R cos(2πs), s ∈ [0, 1]

(21)

For both loops the radius is equal to R = 0.9 and thecenter point is placed at (xc, yc, zc) = (1, 1, 0). Forthe 3-dof pendulum the third component of the vector(x, y, z) is neglected everywhere.

a

0.5

1.5x

0.5

1.5y

0.5

z

b

0.5

1.5

x

1.5y

0.5

z

Fig. 5. a – Path 1 – the circle, b – Lissajous-like Path 2

An admissible error to reach each consecutive goalpoint in the Newton algorithm was set to 0.001, whilethe maximal change of con iguration in a single itera-tionwas equal to∆ = 0.5o. Theparameter ξ1 of theba-sic Newton algorithm (6)was derived from∆, i.e. ξ1 =∆q1i /∆, cf. Eq. (14). When the null-space optimizationwas preformed (cf. Eqns. (16-17)), scales ξ1, ξ2 of thelengths ∆q1i , ∆q2i , Eqns. (14-15), were determined inone dimensional optimization process, preserving theinal length of con iguration change equal to∆.

Two versions of the elastic bandmethodwere run.For Path 1 and Path 2, initial con igurations were se-lected as follows q1(0) = (−18.96◦, 37.93◦, 70.54◦)T

a

0.2 0.4 0.6 0.8 1.0s@-D

50

100

150q@°D

q1

q2

q3

b

0.2 0.4 0.6 0.8 1.0s@-D

-50

50

100

q@°D

q1

q2q3

c

0.2 0.4 0.6 0.8 1.0s@-D

-50

50

100

q@°D

q1

q2 q3

Fig. 6. Trajectories of the 3-dof pendulum generated forPath 1 with: a) the pseudo-inverse Jacobian method, b)the elas c band method (Scheme 1), c) the elas c bandmethod (Scheme 2)

and q2(0) = (−26.05◦, 58.80◦, 104.92◦)T , respectively,which corresponds to the initial point of the cyclicpath (x(0), y(0))T1,2. The results for the 3-dof pendu-lum are visualized in Figs. 6-7(b-c) while the strobo-scopic view of several postures of the manipulator isdepicted in Fig. 8.

Tab. 1. Numerical results for the 3-dof pendulum

Path 1method lenght [o] num. of pts. time [s]pseudo-inverse 253.05 201 0.10Scheme 1 241.41 201 2.10Scheme 2 241.08 257 0.50


9


a

0.2 0.4 0.6 0.8 1.0s@-D

-20

20

40

60

80

100

q@°D

q1

q2

q3

b

0.2 0.4 0.6 0.8 1.0s@-D

-20

20

40

60

80

100

q@°D

q1

q2

q3

c

0.2 0.4 0.6 0.8 1.0s@-D

-20

20

40

60

80

100

q@°D

q1

q2

q3

Fig. 7. Trajectories of the 3-dof pendulum generated forPath 2 with: a) the pseudo-inverse Jacobian method, b)the elas c band method (Scheme 1), c) the elas c bandmethod (Scheme 2)

To compare ef iciency of the elastic band methodwith the Jacobian pseudo-inverse method (i.e. the al-gorithm (6) with ξ2 = 0) some more tests were car-ried out. The irst task of the pseudo-inverse Jaco-bian method was run with the initial con igurationq0 and the goal xf = x(∆s) where ∆s = 0.005.Consecutive goals were selected as xf = x(i∆s),i = 2, . . . , smax/∆s and the node con iguration fromthe (i − 1)-st task became an initial one for the i-thtask. The pseudo-inverse algorithm does not guaran-tee repeatability. Final con igurations for Path 1 andPath 2 q1(1) = (−6.03◦, 12.30◦, 88.81◦)T , q2(1) =(−24.32◦, 56.56◦, 106.96◦)T , respectively, did not meetthe initial con igurations q0.

The length of resulting trajectories, computed as∑K−1k=0 ||q(sk+1)− q(sk)||whereK is the inal number

of node con igurations, number of iterations to com-plete the algorithms and the elapsed time (algorithmswere implemented in the Mathematica package andrun on 2 × 3.3 GHz computer) for Scheme 1 and 2,

a

Initial conf.

Final conf.

-1 2 3x

-1

1

2

3y

b

Initial conf.

Final conf.

-1 2 3x

-1

1

2

3y

c

Initial conf.

Final conf.

-1 2 3x

-1

1

2

3y

Fig. 8. Stroboscopic view of the 3-dof pendulum follow-ing Path 2with: a) the pseudo-inverse Jacobianmethod,b) the elas c band method (Scheme 1), c) the elas cband method (Scheme 2)

10


aswell as for the pseudo-inverse Jacobianmethod, aregathered in Table 1.

The results seem to be rather unexpected as moreconstrained method (the repeatable inverse) gener-ated shorter (or almost the same length) trajecto-ries as unconstrainedmethod (pseudo-inverse). To ex-plain this phenomenon let us notice that the pseudo-inverse method optimizes a change of con igurationlocally (from one iteration to another) while the elas-tic band method exploits global distribution of con-secutive node points. For this particular robot andPath 1, it appears that the global knowledge in con-strained task impacts the inal trajectory more thana local optimization in unconstrained task. It can beeasily noticed, cf. Table 1) that the pseudo-inversemethod without optimization within the null spaceof the Jacobian matrix is the fastest among methodstested and Scheme 2 generated results much fasterthan Scheme 1.

Finally, two schemes were tested on thePUMA robot using cyclic Path 1 and Path 2. Onceagain, a comparison with the pseudo-inverseJacobian method was preformed. For Path 1,the initial con iguration was chosen as q(0) =(−0.84◦, 62.40◦,−67.05◦, 46.58◦, 44.06◦, 45◦)T whilethe inal con iguration was computed as q(1) =(9.29◦, 66.17◦,−64.37◦,−24.97◦, 15.93◦, 45◦)T . Thepseudo-inverse method run for Path 2 also didnot satisfy the repeatability condition as q(0) =(−11.25◦, 57.87◦,−78.52◦, 41.93◦, 34.17◦, 45◦)T andq(1) = (−6.67◦, 61.80◦,−77.17◦, 43.16◦, 21.58◦, 45◦)T .Numerical characteristics were gathered in Tab. 2while the computed trajectories were plotted inFigs. 9-10. As expected, the pseudo-inverse methoddid not generate a closed-loop trajectory in thecon iguration space. When an elastic band methodwas applied, for both of the schemes, the resultingtrajectory remained cyclic. The ordering of the testedmethods, with respect to the elapsed time to completeprescribed tasks, observed for the model of 3-dofpendulum remains valid also for the PUMA robot. Onecan notice that the length of trajectories for the 3-dofplanar pendulum is almost the same for both initialpaths given in R2 while lengths of trajectories forthe PUMA robot differ signi icantly where paths aregiven in R3. This observation should attract attentionof path designers to appropriately locate the pathwithin a taskspace (usually the shape of the path isdetermined by technological requirements, but itslocation not necessary).

4. ConclusionsIn this paper an adaptation of the elastic band

method to repeatable inverse kinematic task was pre-sented. Advantages of the proposed approach include:an application of known and simple methods (theNewton algorithm with an optimization in the nullspace of the Jacobian matrix) simplicity of kinematiccomputations which allow to implement this algo-rithm also in on-line regime, admittance of many vari-ants of changing the current cyclic trajectory as well

a

0.2 0.4 0.6 0.8 1.0s@-D

-50

50

100

150

q@°D

q1

q2

q3

q4

q5 q6

b

0.2 0.4 0.6 0.8 1.0s@-D

-50

50

100

150

q@°D

q1

q2

q3

q4

q5q6

c

0.2 0.4 0.6 0.8 1.0s@-D

-50

50

100

150

q@°D

q1

q2

q3

q4

q5q6

Fig. 9. Trajectories of the PUMA robot generated forPath 1 with: a) the Jacobian pseudo-inverse method, b)the elas c band method (Scheme 1), c) the elas c bandmethod (Scheme 2)

Tab. 2. Numerical results for the PUMA robot



as optimization at many stages of the algorithm (theNewton algorithm, selection the number and distribu-tion of node points). This algorithm admits also exten-sion inoptimizing the initial node con iguration (in thecurrent version is was assumed as a given initial data).One more way to extend the proposed algorithm is toadapt it to obstacle-cluttered environements, possibly,

11


a

0.2 0.4 0.6 0.8 1.0s@-D

-50

50

100

q@°D

q1

q2

q3

q4 q5 q6

b

0.2 0.4 0.6 0.8 1.0s@-D

-50

50

100

q@°D

q1

q2

q3

q4q5

q6

c

0.2 0.4 0.6 0.8 1.0s@-D

-50

50

100

q@°D

q1

q2

q3

q4q5

q6

Fig. 10. Trajectories of the PUMA robot generated forPath 2 with: a) the Jacobian pseudo-inverse method, b)the elas c band method (Scheme 1), c) the elas c bandmethod (Scheme 2)

via rede ining quality function f to punish approach-ing to obstacles.

AUTHORSIgnacy Duleba∗ – Institute of Computer Engineering,Control and Robotics Wroclaw University of Technol-ogy, 50–372 Wroclaw, Janiszewski St. 11/17, e-mail:[email protected].

Michal Opalka – Institute of Computer Engineering,Control and Robotics Wroclaw University of Technol-ogy, 50–372 Wroclaw, Janiszewski St. 11/17, e-mail:[email protected].∗Corresponding author

REFERENCES[1] Chiaverini S., Oriolo G., Walker I.D., Hand-

book of Robotics, chapter Kinematically redun-dantmanipulators, Springer Verlag, Berlin, 2008,245–268.

[2] Duleba I.,Methods andalgorithms ofmotion plan-ning for manipulators and mobile robots, EXIT,Academic Science Publisher, Warsaw, 2001, inPolish.

[3] Duleba I, Opałka M., Using the elastic bandmethod to repeatable inverse kinematic algo-rithm for manipulators, Science Works, WarsawUniv. of Technology, Electronics series, 2012, vol.182, 2012, 351–356, in Polish.

[4] Karpinska J., Approximation of algorithms forrobotmotion planning. PhD thesis, WroclawUni-versity of Technology, 2012, in Polish.

[5] Klein C.A., Chu-Jeng C., Ahmed S., “A new for-mulation of the extended Jacobian method andits use in mapping algorithmic singularities forkinematically redundant manipulators”, IEEETrans. Robot. Autom., vol. 11, 1995, 50–55.

[6] Lee C., Robot arm kinematics, dynamics, and con-trol, Computer, vol. 15 (12), 1982, 62–80.

[7] Nakamura Y., Advanced Robotics: Redundancyand Optimization. Addison Wesley, New York,1991.

[8] Quinlan S., Khatib O., “Elastic bands: Connectingpath and control”, IEEE Int. Conf. on Robotics andAutomation, vol. 2, 1993, 802–807.

[9] Richter S., DeCarlo R., “Continuation methods:Theory and applications”, IEEE Tran. on Auto-matic Control, vol. 28, no. 6, 1983, 660–665.

[10] Roberts R., Maciejewski A.A., “Repeatable gen-eralized inverse control strategies for kinemati-cally redundantmanipulators”, IEEE Tran. on Au-tomatic Control, vol. 38, 1993, 689–699.

[11] SpongM., VidyasagarM., Introduction to robotics.Robot Dynamics and Control. MIT Press, Cam-bridge, 1989.

[12] Tchon K., Duleba I., “De inition of a kinematicmetric for robot manipulators”, Journal ofRobotic Systems, vol. 11, no. 3, 1994, 211–221.

12


Articles 13

Impedance Controllers for Electric-Driven Robots

Edward Jezierski, Artur Gmerek

Submitted: 15th January 2013; accepted: 3rd June 2013

Impedance Controllers for Electric-Driven Robots

Submitted: 15th January 2013; accepted: 3rd June 2013

Edward Jezierski, Artur Gmerek

Abstract:

This article presents a proposal of impedance controller, which is able to infulence the compliance of a robot, based on information obtained from a low-level controller about robot interaction with the environment. The proposed system consists of a low-level controller with proximity sensor, based on which mechanical impedance is adjusted. The intention of the creators was to develop a universal impedance regulator in the sense of hardware and software layers. Because of TCP/IP architecture, the designed regulator can be easily adapted to different robot controllers. Experiments were conducted on a real 1-DOF manipulator driven by BLDC motor, as well as using simulation on a 2DOF planar robot.

Keywords: impedance controller, admittance contro-ller, 2DOF manipulator, BLDC drive

1. Introduction

Nowadays, in an industrial environment, we often deal with positional robot control. This type of con-trol takes place when the manipulator is highly rigid, thus the kinematic chain is characterized by high impedance. However sometimes is needed to soften a kinematic chain. In this situation impedance con-trol can be used. A robot’s mechanical impedance is usually equated with the impedance of its mechani-cal part. The concept of impedance control was first introduced to the field of robotics by Hogan [3]. If we assume that we are utilizing the analogy of velocity-current [5, 6], then mechanical impedance can be defined with a greater precision. Namely, after the adaption of an incremental linear dynamical model in the vicinity of a chosen operating point, the me-chanical impedance of the kinematic chain is defined as the Laplace transform of the force exerted by the end-effector of the manipulator to the Laplace trans-form of effectors’ velocity.

Proper control of a robot’s actuators can change the impedance of its kinematic chain. It is relatively easily manageable in the case of pneumatic or hy-draulic actuators [2]. Their rigidity is naturally relat-ed to energy-transferring media. In the case of elec-tric drives, it should be utilized to control kinematic chain stiffening or softening with the help of feed-back mechanisms. In this process, it is necessary to anticipate how the manipulator will interact with the environment, as changes in impedance cannot be infinitely fast. This is due to the fact that electric actuators are stiff according to the configuration of drives (usually high velocity and high gear ratio).

In most cases interaction with the environment must be detected early enough to enable the controller to respond effectively. This can be achieved by gather-ing information about the environment, obtained from the robot’s external sensors.

The article proposes a system that makes it easy and efficient – to connect the manipulator controller with a proximity sensor. It consists of a low-level driver which works with the designed computer system. The data collected by the sensor is sent to the impedance controller software which processes the data and controls an electric drive associated with controller in order to obtain desired flexibility. The manipulator can thus be prepared to interact with an external object to avoid harmful collisions. Experiments were done on robot with one degree of freedom (DOF), as well as using simulation on a 2DOF planar robot.

Impedance control is currently under develop-ment in a growing number of research centers. Both stiffness and damping adjusters can be viewed as primitive impedance drivers. There are only a few studies describe how to interact with the control system with the use of external sensors in order to optimize manipulator impedance. Interaction with the environment is mostly affected by the use of force sensors [1, 7]. However, force sensors are ef-fective only when the manipulator’s movements are sufficiently slow. If manipulator velocity is high, potential collisions with a rigid environment should be detected in advance. This can be accomplished with the help of proximity sensors, radars or vision systems. An example of this type of solution is de-scribed in [9]. The authors used information from the vision system to impact manipulator impedance. Impedance was switched abruptly when the robot approached an object.

2. Impedance Controller

The manipulator’s desirable properties are de-termined by the environment and the features of the robot controller. Usually it is assumed that if the robot is close to a person or detects an undesirable object on its planned trajectory, the susceptibility of the kinematic chain should increase. When the sen-sor detects an approaching object, the controller changes the manipulator’s impedance, depending on the distances between the robot’s arm and an object.

The system presented in this article consists of a brushless DC motor (BLDC) which drives the robot link, an inductive proximity sensor with a low-level controller and a computer system that processes the data and controls the motor. The computer system


Articles14

consists of two programs. The first captures and processes data from the low-level driver, while the

second controls the motor (Fig. 1).

Fig. 1. The overall block diagram of the control system for BLDC motor

2.1 The Impedance Driver

Description of the dynamics of the manipulator in contact with the environment can be described as follows:

�(�)�� + �(�, �� )�� + �(�) + �(�, �� ) � � � �� (1)

where � is an inertia matrix, � – a matrix of velocity couplings, � – a vector of gravitational forces, � – a friction vector, � – a vector of control torques and forces, ��– a vector of external forces. The vector of external forces acting on the manipulator can be approximated by means of the mass-spring-damper system, it gives: �� (�� ) + ��(�� ) + ��(� � ��) (2)

where � – is the environment’s inertia, �� – is damping and �� is stiffness. Transforming equation (2) to determine join acceleration we get:

�� (�� + ��(�� ) +��(� � ��)) (3)

substitutting (3) to (1) we get:

�(�)(�� (�� + ��(�� ) + ��(� � ��))) + �(�, �� )�� + ℎ(�) + �(�, �� ) + �� (�)

In the case of a 1DOF manipulator, the last equa-tion can be written as:

(�� + �

� ��)(�� (�� + ��(�� ) +��(� � ��))) + �

� ��(�) + �(�, �� ) + �� (5)

The equation can be used to control the robot by the means of “the computed torque feedforward control”. Equation (5) shows that the response of a conventional driver’s impedance depends on the value of acceleration, velocity and position, as well as the value of the force controller, set controller pa-rameters (�, �� and ��) and the object’s dynamic parameters (�, �, �, �).

However, the equation fulfills its function only when the manipulator actively interacts with the environment. Due to the fact that the force can be recorded only when the manipulator establishes contact with the environment it was decided to use virtual force instead of actual force. The equation for the modified force can be expressed as follows:

�� (�)

where a is an estimated position based on infor-mation obtained from the position sensor, and � ��m� is the proportionality parameter (obtained by heuristic methods).

The equation shows that the reaction of the ma-

nipulator does not depend on torque value but is associated with the distance from the object to the manipulator. The proportionality factor has been selected to provide an appropriate response when approaching the outer arm to the object.






�(�)�� + �(�, �� )�� + �(�) + �(�, �� ) � � � �� (1)



�� (�� + ��(�� ) +��(� � ��)) (3)


�(�)(�� (�� + ��(�� ) + ��(� � ��))) + �(�, �� )�� + ℎ(�) + �(�, �� ) + �� (�)


(�� + �

� ��)(�� (�� + ��(�� ) +��(� � ��))) + �

� ��(�) + �(�, �� ) + �� (5)



�� (�)









�(�)�� + �(�, �� )�� + �(�) + �(�, �� ) � � � �� (1)



�� (�� + ��(�� ) +��(� � ��)) (3)


�(�)(�� (�� + ��(�� ) + ��(� � ��))) + �(�, �� )�� + ℎ(�) + �(�, �� ) + �� (�)


(�� + �

� ��)(�� (�� + ��(�� ) +��(� � ��))) + �

� ��(�) + �(�, �� ) + �� (5)



�� (�)





Articles 15

Fig. 2. The example of communication between server, written in Python and compiled C-client

3. Implementation of Hardware and Software The aim of the project was to design a system

that will enable efficient implementation of imped-ance control with the help of different types of phys-ical hardware. Problems with the integration of vari-ous systems occur frequently. In the case being de-scribed, a BLDC electric motor was controlled in real time using the Python interpreter, which cooperates with a data acquisition card, as well as drivers which are dedicated to work with C compiled programs or other compilers. It was therefore necessary to devel-op software which would ensure proper operation of the controller.

The problem of software integration occurs very often when creating systems of greater complexity. There are several methods to combine various pro-grams to form a common system on platforms such

as Windows or Linux. One of the common ways is to implement multithreading functionality, what means running several routines at the same time and ex-changing data via, for example, a swap file. A common resource in this case may be in a computer’s internal memory or on a hard disk drive. However, this type of cooperation between two programs makes it difficult to implement it cor-rectly on the various operating systems. In addition, one of the project’s assumptions was that computer programs are easily integratable.

Due to the above-mentioned issues, a technique of combining two programs using TCP/IP was used. One of the programs acts as a server and the second as client. Both server and client can swap data. It was decided that the server is a program written in Py-thon. It controls the motor and enables connection with client. The client is written in C++ and is re-sponsible for processing data from the proximity sensor.

Experiments were performed using several types of acquisition cards in order to demonstrate the versatility of the solution. We have successfully used cards manufactured, inter alia, by Advantech (USB-4704, PCI-1710), as well as drivers based on

the popular ATmega8 Atmel microcontrollers. Driv-ers communicated with the computer via a USB-UART protocol. 4. Experiments and conclusions 4.1 Experiments on Object with One Degree

of Freedom One of the final experiments was carried out on

the model with one degree of freedom (Fig. 3). In order to determine the optimal control pa-

rameters several experiments were performed to illustrate how changes in stiffness and damping coef-ficients affect the system’s properties. When the manipulator stiffness decreases, it no longer tracks the desired trajectory, as opposed to situations when rigidity is high. This principle was used to determine the values of stiffness and damping parameters that cause a desired change in the robot’s operation.

Experiments consisted of altering the value of stiff-ness and damping parameters when motor was at a steady-state �� .

Fig. 3. The laboratory position with the BLDC motor











rameters several experiments were performed to illustrate how changes in stiffness and damping coef-ficients affect the system’s properties. When manipu-lator stiffness decreases, it no longer tracks the de-sired trajectory, as opposed to situations when rigid-ity is high. This principle was used to determine the values of stiffness and damping parameters that cause a desired change in the robot’s operation.













rameters several experiments were performed to illustrate how changes in stiffness and damping coef-ficients affect the system’s properties. When manipu-lator stiffness decreases, it no longer tracks the de-sired trajectory, as opposed to situations when rigid-ity is high. This principle was used to determine the values of stiffness and damping parameters that cause a desired change in the robot’s operation.




Articles16

Fig. 4. Graph of desired virtual speed of the manipula-tor when changing the damping coefficient. The virtu-al speed was 300 RPM. During the experiments the motor was rotating

Fig. 5. Velocity of the manipulator with discrete changes of virtual forces (the distance of object from the proximity sensor) and changes in controller’s damping parameter. The virtual velocity was 300 RPM

One of conducted experiments was to suddenly

change the force which acts on the manipulator; meaning the distance between the object and the proximity sensor, while also suddenly changing at-tenuation values (Fig. 5). From the graph below it can be concluded that a high damping value makes

the system so rigid that it does not respond to the operating force. If the damping is small, the driver reacts causing even a change in the direction of rota-tion. In some situation it is the manipulator’s most desirable response.

The second experiment was to increase the ma-nipulator’s impedance when the object is closer to the robot. The graph shows that when the value of the virtual force is in the range of 0.18 to 0.19 Nm (the time interval from 45 to 47.5 s), the motor stops. In some cases this is the most optimal re-sponse. After crossing this threshold, the motor be-gins to slowly rotate in the opposite direction (Fig. 6). If the object moved away from the motor, velocity would return to its original value.

Fig. 6. Changes in velocity when approaching the ob-ject to the sensor when distance from the object to the sensor is less than 0.5 cm, the motor direction of rota-tion changes

4.2 Simulation Study of Cartesian Impedance

Controller Second experiment was conducted in Cartesian

space in order to test what the properties of control system are, when impedance controller processes task variables. These parts of experiments were done on 2 DOF planar manipulator in Matlab-Simulink environment. Manipulator was modeled in second edition of Sim-Mechanics toolbox (Fig. 7).

Fig. 7. Manipulator with 2 degrees of freedom modeled in Sim-Mechanics environment

Fig. 4. Graph of desired virtual speed of manipulator when changing the damping coefficient. The virtual speed was 300 RPM. During the experiments the mo-tor was rotating


































Articles 17

The dynamic model of the manipulator was sim-plified to mass points placed in the ends of rigid links. This can be done without limitation to the generality of the results achieved. The dynamics of a manipulator can be written as:

�� + ��(�� + �� + ��) ��(�� + ��)��(�� + ��) ��

� ��

� + �� + ��

� +

�(�� + ��)�� + ��cos (�� + ��)�� cos(�� + ��) � + ��

�� = ��

�� (�)

Forward kinematics of the manipulator is at the form:

�� = �� + �� cos(θ� + θ�) (8)

�� = �� + �� sin(θ� + θ�)

while inverse kinematics is given by:

�� = �� + ��

��

�� = ��

�� + ��

�� (9)

From (8) Jacobian matrix of the manipulator can be obtained:

�(�) =��

��

��

��

��

=

�� sin (�� + ��) ��sin (�� + ��)

�� + ��cos (�� + ��) ��cos (�� + ��) � (��)

and finally an inverse Jacobian is of the form:

��(�) = �

�� cos (�� + ��) ��sin (�� + ��)�� cos (�� + ��) �� sin (�� + ��)� (��)

The other subsystems of control were modeled in standard Simulink environment. The view of the whole system is presented in Figs. 8 and 9.

Fig. 8. The overall scheme of Simulink simulation

The dynamic model of the manipulator was sim-plified to mass points placed in the ends of rigid links. This can be done without limitation to the generality of the results achieved. The dynamics of a manipulator can be written as:

�� + ��(�� + �� + ��) ��(�� + ��)��(�� + ��) ��

� ��

� + �� + ��

� +

�(�� + ��)�� + ��cos (�� + ��)�� cos(�� + ��) � + ��

�� = ��

�� (�)

Forward kinematics of the manipulator is at the form:

�� = �� + �� cos(θ� + θ�) (8)

�� = �� + �� sin(θ� + θ�)

while inverse kinematics is given by:

�� = �� + ��

��

�� = ��

�� + ��

�� (9)

From (8) Jacobian matrix of the manipulator can be obtained:

�(�) =��

��

��

��

��

=

�� sin (�� + ��) ��sin (�� + ��)

�� + ��cos (�� + ��) ��cos (�� + ��) � (��)

and finally an inverse Jacobian is of the form:

��(�) = �

�� cos (�� + ��) ��sin (�� + ��)�� cos (�� + ��) �� sin (�� + ��)� (��)

The other subsystems of control were modeled in standard Simulink environment. The view of the whole system is presented in Figs. 8 and 9.

Fig. 8. The overall scheme of Simulink simulation


Articles18

Fig. 9. The view of decoupling model subsystem (model of manipulator control by the designated torque method). Model was decoupled with PD regulator

Fig. 10. Simulink scheme of admittance controller

It can be seen from Fig. 8, that the impedance

controller was applied in Y-coordinate in manipula-tor control system, while in the X-axis a standard PD regulator was used.

Experiments were designed to see how the ma-nipulator behaves with different impedance control-ler settings. Simulation consisted of controlling the 2DOF robot on the desired trajectory. During 70 seconds of simulation high external force was applied in Y-axis. This situation can be likened to a collision with an obstacle (Fig. 11).

Fig. 11. Schematic representation of 2DOF manipula-tor and force acting in Y axis




















Articles 19

The results of research were presented on below graphs. Three runs show the trajectory tracking in case of rigid manipulator, when impedance was lower and very low impedance of the manipulator.

In the last case, as it was presented in previous ex-periments, the manipulator reacts by subjecting the force. Disturbances that occurred at the end of the workspace are associated with non-stationary of boundary conditions.

Fig. 12. The illustration of changing the impedance of the manipulator. The figure from the left presents the de-creased value of impedance (left figure – very rigid manipulator, middle figure – impedance of controller slightly reduced, right figure – very low value of impedance)











Articles20

5. Future Research

During experimentation it was discovered that choosing the most adequate parameters for the im-pedance controller is a complicated process. A set of parameters is closely related to the numerical meth-ods used to solve differential equations, with a lim-ited speed of calculation, as well as the selection of the scaling factors for information obtained from the position sensors. This is probably one of the main reasons why impedance/admittance controllers are rarely applied in practice.

Upcoming research will therefore be associated with the creation of an adaptive model, probably based on the methods of artificial intelligence, by means of which it will be possible to effectively se-lect appropriate controller settings, depending on the desired behavior of the robot. The next sched-uled project will also involve the integration of an impedance controller to control robots with higher degrees of freedom.

Acknowledgements

This work is financially supported by the Minis-try of Science and Higher Education of Poland (Grant No. N N514 469339).

Note: This is an expanded version of the paper presented during The Polish National Conference on Robotics, 2012.

AUTHORS Edward Jezierski*, Artur Gmerek Institute of Automatic Control, Lodz University of Technology, email: [email protected] tel. 0-42 631-25-40 *Corresponding author References [1] Caccavale F., Siciliano B., “Six-DOF impedance

control based on angle/axis representations”, IEEE Transactions on Robotics and Automation, vol. 15, no. 2, 1999, 289–300.

[2] Granosik G., Jezierski E., Kaczmarski M., “Model-ling, simulation and control of pneumatic jump-ing robot”. In: Proc. European Robotics Sympo-sium 2008. Springer Tracts in Advanced Robotics, Berlin: Springer 2008, 155–164.

[3] Hogan N., “Impedance control: An approach to manipulation”, Parts I, II, III, ASME Journal of Dynamic Systems, Measurements, and Control, 1985, vol. 107, 1–23.

[4] Jezierski E., „Od sterowania pozycyjnego robo-tów do sterowania impedancyjnego”. In: Tchoń K. (ed.) Postępy robotyki – sterowanie, percepcja, komunikacja, Warszawa: WKiŁ, 2006, 13–36. (In Polish)

[5] Jezierski E., “On electrical analogues of me-chanical systems and their using in analysis of

robot dynamics”. In: Kozłowski K.R. (ed.), Robot Motion and Control – Recent Developments, Ber-lin: Springer, Berlin 2006, 391–404.

[6] Jezierski E., Dynamika robotów, WNT, Warsza-wa, 2006. (In Polish)

[7] Ott C., Albu-Schäffer A., Kugi A., Hirzinger G., “On the passivity-based impedance control of flexible joint robots”, IEEE Transactions on Ro-botics, vol. 24, no. 2, 2008, 416–429.

[8] Siciliano B., Sciavicco L., Villani L., Oriolo G., Robotics – modelling, planning and control, Springer, 2009.

[9] Twuji T., Akamatsu H., Hatagi M. and Kaneko M., “Vision-Based Impedance Control for Robot Manipulators”. In: IEEE/ASME International Conference on Advanced Intelligent Mechatro-nics, 1997.

[10] Winiarski T., Zieliński C., „Podstawy sterowania siłowego w robotach”, PAR, no. 6, 2008, 5–10. (in Polish)

[11] Jezierski E., Gmerek A., ”Admittance control of a 1-DoF robotic arm actuated by BLDC motor”. In: 17th International Conference on Methods and Models in Automation and Robotics (MMAR), 2012, 633–638.


S S D OI C MS S D OI C MS S D OI C MS S D OI C M


Tomasz Winiarski, Konrad Banachowicz, Maciej Stefańczyk

DOI: 10.14313/JAMRIS_4-2013/21

Abstract:Modern robo c systems are able to localize doors andits handle or knob, grasp the handle and open the door.Service robots need to open doors and drawers to au-tonomously operate in human environment. The me-chanical proper es of doors lead to incorpora on of forceand velocity constraints into the control law, to avoid theenvironment damage. In the ar cle the impedance con-trol law was expanded with these factors to achieve safebehavior of direct (explicit) posi on-force controller ofKUKA LWR4+ robot that opens the door.

Keywords: manipula on, impedance control, servicerobots

1. Introduc onRecent years have resulted in a number of robotic

manipulator structures [4, 9] mounted on mobilebases. Themain scienti icmotivation for their creationwas to build a hardware and software base for ser-vice robots operating in human environment. The re-search covers many issues related to human’s every-day existence, such as working in the kitchen [11] ormovement between rooms [7]. Both of these tasks re-quire opening doors. Depending on the needs it can bea little door in the kitchen cabinet as well as a massiveroom door.

One of the irstworks in thismatter considers dooropening with a four degrees of freedom robotic ma-nipulator [8]. The control law assumed that the ma-nipulator exerts the desired general force set in itsend–effector and system measures the position in thesame coordinate system simultaneously. Taking intoaccount the real trajectory of the end–effector, the con-trol law parameters were chosen to generate the de-sired force that opens the door. Modern applications[4, 7, 9, 11] are much more comprehensive and takeinto account several aspects e.g.: location of the door,location of the handle or knob, grip generation andgrip execution.

It is worth noting that the inal outcome of theexperiments presented so far related to door open-ing is comparable for various structures of the con-trollers. The work [4] presents robotic system that iscontrolled indirectly and general force readings camenot from a six-axis sensor mounted in the manipula-tor wrist [15] but from sensors located in the grip-per ingers phalanges. The control law applied in thedirection of door opening was constructed by super-position of reference velocity and force error depen-

dent correction velocity. Similarly, the robot describedin [11] had indirect control structure, but in this casethe force feelingwas the fusion ofmeasurements froma six-axis transducer placed in a manipulator wristand force data from two inger gripper phalanges. PR2Robot [7] control system does not measure the gen-eral force in manipulator wrist, but, on the basis ofvelocity command speci ied in the end–effector, com-putes the additional desired joints torque componentfor gravitationally balanced manipulator. System de-scribed in [9] utilized one of the irst generations ofdirectly controlled DLR-LWR robot, where the manip-ulator tip contact force computation is based on thetorques measured in joints.

It is impossible to pass in silence over the factthat the opening of doors and drawers is also im-portant in medical research. For example [13] stud-ied opening a drawer by healthy people and patientswith impaired cerebellum. It turns out that patientswith dysfunction of the cerebellum have a differentdrawer opening strategy than healthy patients. Typi-cally, ingers’ clamping force on the handle is propor-tional to the pulling force to avoid handle slip. Sim-ilarly, healthy patients, shortly after the end of themovement is felt, more strongly tighten the handleso as not to lose their grip. Patients with cerebellardysfunction also perform the whole tasks, but clenchtighter in the irst phase and sometimes lose the gripat the end of the motion, when the drawer reachesmechanical limitations. The forces exerted by themanand the speed of movement during similar operationsare well known and tested. It results in the construc-tion of doors and drawers, in particular, the durabilityof handles, hinges etc.

There are software drivers [10,16] for robots per-forming service tasks, as well as methods of image ac-quisition and analysis [6] necessary to detect objects(e.g. handles or doors [3]). The latter task can also uti-lize fused information from depth sensors and colorcameras [14], which can further improve detection re-sults. So far, there was a lack of control law relatingto some of the top-imposed limitations resulting fromthe door construction adapted for use by humans. Thehandle has a limited resistance, so it is important tolimit the force with which the door is pulled. On theother hand, a rotational speed of the door can not bearbitrarily high, because the accumulated kinetic en-ergy may be potentially dangerous in the case of en-countering an obstacle (e.g., human being).

In this paper we present in a formal way a roboticsystem (sections 2, 3), where the control law allows

21


both opening a door with previously unknown kine-matic and dynamic models, taking into account theconstraints derived from safety requirements. More-over, the algorithm allows to open the door with-out prior knowledge of opening direction and with-out a known position of the hinge (the same algorithmopens both the „right” and „left” doors). In the exper-imental stage (section 4) the door and manipulatorend–effectorwere rigidly connected to test the controllaw by exposing the system to the fast-growing strain.The paper inishes with conclusions in section 5.

2. Nota onThe controller description is formal, hence it starts

with the notation. Position of frame Q relative to Ucanbe expressed as either homogeneousmatrixU

QT ora column vector U

Qr. In the latter case, the position inCartesian coordinates is extended by a rotation anglearound the axis represented by directional vector. Therotational angle and the directional vector (after mul-tiplication) are aggregated into three coordinates. Theoperator A transforms this column vector into a ho-mogeneous matrix, andA−1 de ines an inverse trans-formation:

A(UQr) =UQT , A−1(UQT ) = U

Qr (1)

Manipulator con iguration can also be expressedwith a joint position vector q.

The column vector U r =[UvT , UωT

]T of size 6 ×1 represents generalized velocity of frame U movingrelative to frame 0 expressed in U . It consists of linearvelocity v and rotational velocity ω:

U (0U r) =U r (2)

The column vector UF of the same size 6×1 expressesgeneralized force and consists of a 3 element force vec-tor and a torque vector of the same size. In this case,force is applied to the origin of theU coordinate frame,and is expressed in the same frame.

Some other useful transformations UQξF , UQξV ex-

press generalized force or velocity in one coordinateframe relative to the other, ixed to it:

UF = UQξF

QF , U r = UQξV

Qr (3)

In the case when free vectors are used (such asincrements of position, orientation, velocity or force)there exists a need to express them relatively to theframe, with an orientation other than the one inwhichit was formerly expressed. In this case, one can usethe C(U r) notation, in which generalized velocity ofUframe in relation to 0 is expressed in frame C . For thetransformation purpose the matrix ξ∗, [16] is used:

C(U r) = CUξ∗

U r (4)

In the notation below, d means desired value, and mis the measured value (if d orm are placed in the bot-tom right index). Similarly, the bottom right indexwithsquare brackets denotes a coordinate associated with

a vector or a matrix. We employ the convention of us-ing x, y and z to indicate the linear parts and ax, ay, azto indicate the rotational parts. In ig. 2 some selectedcoordinate frames are marked: 0 – base,W – wrist, E– end–effector (task frame).

3. ControllerResearch has been done on the system based on

KUKA LWR4+ robots. To present the system structure,the formal notation presented earlier in [6, 16] wasused in a modi ied form. The graphical representation( ig. 1) was supplemented by a direct de inition of thedata exchangedwithin a single agent (between its con-trol subsystem c, virtual effector e and real effectorE).

The transition functions that generate outputsbased on inputs and internal state are de ined forparticular components and indicate inputs and out-puts for the described component: [S]tx are inputs, [S]tyare outputs, where S is the input/output value and tmeans discrete time index. Other variables (the inter-nal state of the agent in particular) arewrittenwithoutthe additional symbols.

W0 TmWFm

Kj

Bj

τd

qd

EFd

EW TKc

Bc

E(D rd)

t

E

e

c

W0 Tm

J

EFd

EW TKc

Bc

D0 T

W0 Tm

CartesianImpedanceController

TrajectoryGenerator

Fig. 1. The General structure of the designed single-agent controller: E - real effector, e - virtual effector,c - control subsystem. The two components presentedare an impedance controller in task space and a trajec-tory generator for the impedance controller

Thedetailed description of the system is presentedin the following part of the article. In section 3.1 thereal effector E (KUKA LWR4+ arm) is described withits industrial controller adapted toworkwith researchsystems. Later on, section 3.2 characterizes the de-veloped research controller (virtual effector e) imple-mented in Orocos [2] system. This particular frame-work was chosen because of the availability of com-munication interface with KUKA and since its struc-ture is universal (so it can be used in many works, go-ing beyond the scope of this project). Further part ofthe paper (section 3.3) presents a strategy for open-ing a door, especially the developed control law em-bedded in control subsystem c. This is implementedin ROS [10] system, which was chosen due to thesimplicity of the formulation of tasks using scripting

22


Fig. 2. KUKA LWR4+ armwith its coordinate frames andtransforma on between them, while execu ng the taskof opening the door

languages and ready-to-use communication interfacewith Orocos system.3.1. Industrial controller KUKA LWR4+

KUKA LWR4+ is a serial, lightweight (16kg), re-dundant robotic arm with seven degrees of freedom.Its construction ( ig. 2) is similar to human arm interms of size, lifting capacity and manipulation capa-bilities,which enables it to safely operate in humanen-vironment. Just like typical industrial robots, LWR4+is equipped with motor encoders, but it also has sen-sors of joint position and torque. It can be controlled injoint space and external space, moreover the desiredtorques can be set in joints. KUKA provided the possi-bility to use LWR4+with external controllers, particu-larly for research purposes. The controller is equippedwith Fast Research Interface (FRI) [12] and uses Eth-ernet and UDP protocols, that allow to read robot’sstate and to send control commands.

In this work the impedance control law extendedby desired torque is used for joints control (5), in thesamewayasparallel position-force controller in exter-nal space extends position regulators with the desiredforce [16].

τκ+1c = [Kj ]

ιx([qd]

ιx − qκm) + [Bj ]

ιx q

κm+

[τd]ιx + f(qιm, qιm, qιm)

(5)

The control law above presents an external con-trol loop of real effector (∆κ ≈ 0.3ms), with desiredtorque τκc as output, which is then used as an inputto the internal loop with torque controller runningwith the frequency of tens of kHz. Some variables areupdated less frequently, typically with the period of∆ι = 1ms. Input values for extended impedance con-troller are as follows: [Kj ]

ιx - stiffness, [Bj ]

ιx - damp-

ing, [qd]ιx - desired position, [τd]ιx - desired torque.The impedance controller needs the measured posi-tion qκm and velocity qκm, and the dynamic model it-self f(qιm, qιm, qιm) . The structure of this regulator aswell as its stability analysis were presented in [1].3.2. KUKA LWR4+ research controller

The research controller (virtual effector e) con-sists of a number of Orocos components and an XML

ile that de ines the connection between them. Theindividual components are responsible for communi-cation within the entire system (including the use ofFRI), internal diagnostics, as well as the implementa-tion of the control law and trajectory generation. Inthis article, we focus on the latter two aspects.

Trajectory generator component Two key compo-nents used in the research controller are trajectorygenerator and impedance controller. Trajectory gen-erator receives commands from the control subsys-tem c in discrete time instants t. On the other side, ineach step ι of the real effector regulator E, trajectorygenerator has to prepare the command for impedancecontroller in external space (implemented in cartesianimpedance controller component). For two consecu-tive time instants i, the interval is de ined by [t]ix. Inthe initial phase, the desired position is copied fromthe measured position (6):

[ 0DT ]ι0y = 0ET ι0

m (6)

It is assumed that stiffness, damping, desired forceand the geometric matrix of the tool are simply copied(7):

[Kc]ιy = [Kc]

ix, [Bc]

ιy = [Bc]

ix,

[EFd]ιy = [EFd]

ix, [WET ]ιy = [WET ]ix

(7)

Desired position [ 0DT ]ιy is computed based on de-sired velocity [

E(D rd)]

ix and previous control (8) ac-

cording to the presented notation:

[ 0DT ]ιy = 0DT (ι−1)A

((DEξ

ι∗ [

E(D rd)]

ix)∆ι

)(8)

where matrix DEξ

ι∗ is computed in the way pre-

sented in [16] based on the matrix 0ET ι

m from (9) andmatrix 0

DT (ι−1).0ET ι

m = [ 0WTm]ιx [WET ]ix (9)

Impedance controller component Cartesian impe-dance controller component implements control lawin task-related space. Column vector E

Dr representingthe length of the virtual spring (10) is computed tak-ing into account the measured effector position 0

ET m

and the desired position 0DT .

EDrι = A−1(E0T ι

m [ 0DT ]ιx) (10)Velocity E rm of the end–effector (11) is computed

as:E rm

ι =A−1(E0T

(ι−1)m

0ET ι

m)

∆ι(11)

In the next step, the force EFc (12) is computedtaking into account the spring length E

Dr, stiffnessKc,end-effector velocity E rm, damping Bc and desiredforce EFd. The force EFc is then transformed to thewrist frame (13).

EF ιc = [Kc]

ιx

EDrι + [Bc]

ιx

E rιm + [EFd]ιx(12)

WF ιc = W

EξιF

EF ιc (13)

23


where WEξ

ιF is based on [WET ]ιx. Finally, according

to (14) joint torques vector τd is computed using Jaco-bian J .

[τd]ιy = [JT ]ιx

WF ιc (14)

Stiffness and damping in the output of the compo-nent are set to zero [Kj ]

ιy = 0, [Bj ]

ιy = 0, in order to

command the joint level controller implemented in thereal effectorE (5) to work as a torque controller withcompensation of both gravity and partially dynamics.

3.3. Door opening strategy

The control subsystem c includes outermost con-trol loop running over the existing impedance con-troller implemented in the virtual effector e. This so-lution enables the incorporation of the door openingtasks in a larger application, consisting of multiplephases of motion, by adjusting the impedance char-acteristics for smooth transition from the movementwithout any contact with the environment to the con-tact phase, as it is possible in the pneumatic-poweredrobots [5].

The matrix [WET ]iy de ines the pose of the manipu-lator tool. It depends on the mechanical constructionof the end–effector and it is sent to the virtual effec-tor. This matrix does not change during task execu-tion. Similarly, the general set force [EFd]

iy is always

zero (i.e. all of the force and torque coordinates are ze-ros). In addition, the time [t]iy of the single trajectorysegment execution is sent. The other output parame-ters are de ined in terms of the direction of motion inthe task space. For directionsx, y, ax, ay and az it is as-sumed that the manipulator should be compliant andmove in order to eliminate tension, so the correspond-ing coordinates of the reference velocity, stiffness anddamping (15) become zero.

[E(D rd[x,y,ax,ay,az])]

iy = 0, [Kc[x,y,ax,ay,az]]

iy = 0,

[Bc[x,y,ax,ay,az]]iy = 0

(15)Towards the z axis control law is more elabo-

rated, since it sets the velocity [E(D rd[z])]iy taking into

account the measured position 0ET i

m, the maximumspeed r[z], the measured force EF i

m[z] and two forcethresholds: braking threshold F0 and motion termi-nation threshold Fl. The stiffness [Kc[z]]

iy and damp-

ing [Bc[z]]iy are also de ined. In each iteration, the con-

troller irst checks the completion of the predicate ofthe movement (16), (17). The predicate P () returnstrue, if the force measured along the z axis of the taskcoordinate system exceeds the threshold value (dooris blocked) or the dot product of the vector that is nor-mal to the plane of the door at the beginning of themo-tion and the same vector at the current time instant isless than zero (the door turn angle exceeded 90 de-grees).

N i = [ 0ET im[1][3],

0ET i

m[2][3],0ET i

m[3][3]]

(16)

P () =

{true ifN i0 ·N i ≤ 0 ∨ [Fm[z]]

ix ≥ Fl

false otherwise(17)

If the predicate P () is satis ied, the desired ve-locity [

E(D rd[z])]

iy = 0. The robot motion is not

commanded, but the manipulator end–effector is stillimpedance controlled in z axis. Otherwise, the velocityis determined according to the formula (18), so as tocontinuously slow down the robot, starting from themoment when the measured force values exceed thebraking threshold F0.

[E(D rd[z])]

iy =

r[z] for EF im[z] ≤ F0

r[z]

(1.0−

EFim[z]−F0

Fl−F0

)for F0 < EF i

m[z] ≤ Fl

0 for EF im[z] > Fl

(18)

4. ExperimentsThe veri ication of the strategy of door open-

ing consists of a series of experiments conducted onthe test-bed presented in igure 2 and video1. Theend–effector of the manipulator is rigidly ixed to thekitchen cabinet door. The experiments consisted ofalternating closing and opening the door for differ-ent sets of controller parameters and limitations ofthe contact force. The control subsystem time periodwas constantly set to [t]iy = 10ms. In addition, doormotion was disturbed by an obstacle in the form ofheavy object standing in its way. Experiments havecon irmed the correctness of the approach. For the fur-ther presentation the door opening case was chosen( ig. 3), where the following coef icients where cho-sen experimentally. The velocity limit was set as r[z] =0.04m

s , the braking threshold F0 = 10N , the forcelimitFl = 20N , stiffness [Kc[z]]

iy = 4000N

m and damp-ing [Bc[z]]

iy = 80Ns

m .Initially, the manipulator is slightly pushing the

door along with the z axis of the current task coor-dinate system (it can be seen at the beginning of theforces graph), because the door openingwas precededby its closure, which ended up with a little stretchof impedance controller „spring” while the door wasclosed. Then, at the time instant a, the motion is or-dered, the force along the z axis changes direction,and inally, after crossing the static friction of hinge,door begins to move as illustrated by the signi icantincrease in speed along the z axis. The slight oscilla-tion of the forces is visible while the manipulator isin motion. The force in the z direction, which movesthe door, is clearly dominant over the values of theforces in the xy plane, where manipulator is com-pliant. The velocities in the xy plane are adequately

24


4 6 8 10 12 14− 15

− 10

− 5

0

5E

Fm

[N]

4 6 8 10 12 14− 0.06− 0.05− 0.04− 0.03− 0.02− 0.01

0.000.01

Er m

[m/s

]

z

z

xy

xy

time [s]

a b c

a b c

Fig. 3. The measured forces and veloci es recorded dur-ing door opening

small. At the time instant b the door encounter an ob-stacle in the form of a box weighing about 3kg, whichlies directly on the surface below the closet. The forcealong the z axis is growing rapidly, but in this casedoes not exceed the limit value. The speed drops dra-matically and starts to oscillate, while the manipula-tor is pulling the doors with the obstacle. Later, afterpassing through the phase of the oscillation caused bythe impedance controller behaviour, themeasured ve-locity stabilizes slightly below the limit. At this stage,noticeable higher force is seen, with which the ma-nipulator pushes the door against the resistance ofthe obstacle. Movement stops in time instant c, whenthe desired angle is reached. The force in the z direc-tion is non-zero, as the impedance controller „spring”is stretched due to non-zero static friction inside thedoor hinge and static friction of the obstacle with theground.

5. ConclusionsIn the article a research oriented controller of

KUKA LWR4+ robotic manipulator, that executes asafe door opening strategy, is presented in a formalway. This approach takes into account the limita-tions of both the speed of the door, from its handlepoint of view, and the contact force recorded in theend–effector of the manipulator. Experimental veri-ication con irmed the correct behavior of the sys-tem. Future research will rely on the solution alreadyworked out and will include the selection of the con-troller parameters (stiffness and damping) in the faceof the experimentally identi ied constraints on thespeed and strength of the contact. The work will be-gin with the study of the method of opening the doorby a representative group of people, and then the im-plementation of a similar strategy for the robot.

Notes1http://vimeo.com/rcprg/door-opening-stiff-contact

AUTHORTomasz Winiarski, Konrad Banachowicz, MaciejStefańczyk – Institute of Control and Compu-

tation Engineering, Warsaw University of Tech-nology, e-mail: [email protected], www:http://robotics.ia.pw.edu.pl.

ACKNOWLEDGEMENTSThe authors gratefully acknowledge the support ofthis work by TheNational Centre for Research andDe-velopment grant PBS1/A3/8/2012. TomaszWiniarskihas been supported by the European Union in theframework of European Social Fund through theWarsaw University of Technology Development Pro-gramme.

REFERENCES[1] A. Albu-Schäffer, C. Ott, and G. Hirzinger, “A uni-

ied passivity-based control framework for po-sition, torque and impedance control of lexi-ble joint robots”, The International Journal ofRobotics Research, vol. 26, no. 1, 2007, pp. 23–39.

[2] H. Bruyninckx, “Open robot control software: theorocos project”. In: International Conference onRobotics and Automation (ICRA), vol. 3, 2001, pp.2523–2528.

[3] M. Chacon-Murguia, R. Sandoval-Rodriguez, andC. Guerrero-Saucedo, “Fusion of door and cor-ner features for scene recognition”, Journal of Au-tomationMobile Robotics and Intelligent Systems,vol. 5, no. 1, 2011, pp. 68–76.

[4] W. Chung, C. Rhee, Y. Shim, H. Lee, and S. Park,“Door-opening control of a service robot usingthemulti ingered robot hand”, IEEE Transactionson Industrial Electronics, vol. 56, no. 10, 2009, pp.3975–3984.

[5] E. Jezierski, G. Granosik, and M. Kaczmarski,“Impedance control of jumping robot (in Pol-ish)”, Publishing House of Warsaw Universityof Technology, Elctronics, vol. 166, 2008, pp.185–194.

[6] T. Kornuta, T. Bem, and T. Winiarski, “Utilizationof the FraDIA for development of robotic visionsubsystems on the example of checkers’ playingrobot”, Machine GRAPHICS & VISION, 2013, (ac-cepted for publication).

[7] W. Meeussen, M. Wise, S. Glaser, S. Chitta, C. Mc-Gann, P. Mihelich, E. Marder-Eppstein, M. Muja,V. Eruhimov, T. Foote, et al., “Autonomous dooropening and plugging in with a personal robot”.In: IEEE International Conference onRobotics andAutomation (ICRA), 2010, pp. 729–736.

[8] G. Niemeyer and J. Slotine, “A simple strategy foropening an unknown door”. In: InternationalConference on Robotics and Automation (ICRA),vol. 2, 1997, pp. 1448–1453.

[9] C. Ott, B. Bäuml, C. Borst, and G. Hirzinger,“Employing cartesian impedance control for theopening of a door: A case study inmobile manip-ulation”. In: IEEE/RSJ international conference on

25

http://vimeo.com/rcprg/door-opening-stiff-contact


intelligent robots and systems workshop on mo-bile manipulators: Basic techniques, new trends &applications, 2005.

[10] M. Quigley, B. Gerkey, K. Conley, J. Faust, T. Foote,J. Leibs, E. Berger, R.Wheeler, and A. Ng, “ROS: anopen-source Robot Operating System”. In: Pro-ceedings of the Open-Source Software workshopat the International Conference on Robotics andAutomation (ICRA), 2009.

[11] A. Schmid, N. Gorges, D. Goger, and H. Worn,“Opening a door with a humanoid robot usingmulti-sensory tactile feedback”. In: InternationalConference on Robotics and Automation (ICRA),2008, pp. 285–291.

[12] G. Schreiber, A. Stemmer, and R. Bischoff, “Thefast research interface for the kuka lightweightrobot”. In: IEEE ICRA Workshop on InnovativeRobot Control Architectures for Demanding (Re-search) Applications–How toModify and EnhanceCommercial Controllers. Anchorage, 2010.

[13] D. Serrien andM.Wiesendanger, “Grip-load forcecoordination in cerebellar patients”, Experimen-tal brain research, vol. 128, no. 1, 1999, pp.76–80.

[14] M. Stefańczyk andW. Kasprzak, “Multimodal seg-mentation of dense depth maps and associatedcolor information”. In:Proceedings of the Interna-tional Conference on Computer Vision and Graph-ics, vol. 7594, 2012, pp. 626–632.

[15] T. Winiarski and A. Woźniak, “Indirect force con-trol development procedure”, Robotica, vol. 31,2013, pp. 465–478.

[16] C. Zieliński and T. Winiarski, “Motion genera-tion in theMRROC++ robot programming frame-work”, International Journal of Robotics Research,vol. 29, no. 4, 2010, pp. 386–413.

26


Articles 27


3D �� M�� R�� 3D �� M�� R�� 3D �� M�� R�� 3D �� M�� R��

�u��ed: 20th January 2013; accepted: 4th June 2013

Maciej Stefańczyk, Konrad Banachowicz, Michał Walęcki, Tomasz Winiarski

DOI: 10.14313/JAMRIS_4-2013/28

Abstract:

�he ar�cle presents a naviga�on system based on 3Dcamera and laser scanner capable of detec�ng a �iderange of obstacles in indoor environment. First, exist-ing methods of 3D scene data ac�uisi�on are presented.�hen the ne�naviga�on systemgathering data fromvar-ious sensors (e.g. 3D cameras and laser scanners) is de-scribed in a formal �ay, as�ell as exemplary applica�onsthat verify the approach.

Keywords:mobile robo�cs, naviga�on, ��-D sensing

�� trod�c�o�

An autonomous mobile robot navigation requiresrobust methods of environment analysis and obstacledetection. Typically, to achieve this goal, whole sets ofvarious sensors are used, some of them giving point-like readings (e.g. infrared or ultrasound range �ind-ers) [13], some scanning in a plane (e.g. lidars). Thatkind of sensor con�iguration allows a safe movementof the robot in an unknown environment, but there isstill a broad list of obstacles, that are hard to detectand avoid, most of them being either small (but dan-gerous) things laying on ground level or things hang-ing from top, that can hit upper parts of the robot ma-neuvering beneath them.

Current solutions, based on the aforementionedsets of sensors, have some shortcomings. Sometimesthere is necessity of special preparation of the envi-ronment to make it reachable for robots (by, for ex-ample, clear marking of dangerous zones), restrictionof common space used simultaneously by people androbots or making some strong assumptions about theenvironment (like riding only on smooth, �lat surfacewithout bumps or depressions [8]).

In the article a navigation method is presentedthat, thanks to utilization of data from both lidar and3D camera, overcomes the most of the above men-tioned shortcomings and is capable of detecting awiderange of obstacles in indoor environment. The robotcontrol system (sec. 2, 3) is formally speci�ied for theclearness of presentation and to simplify its furtherre-usage. On the other hand, presented control sys-tem still allows one to apply existing methods of se-curing robot surrounding, giving also possibility foreasy fusion of data from diversity of sensors (whichwas proven in experiments described in section 4).

�� e �et�od o� syste� s�ec��ca�o�Design of robot control systems requires a spec-

i�ication method that would facilitate its subsequentimplementation. Amultitude of robot controller archi-tectures was proposed in the past, e.g. the subsump-tion architecture [2], the blackboard architecture [17]or an integrated architecture presented in [1]. In thispaper we utilize a method based on earlier works fos-tering decomposing system into agents and descrip-tion of the behaviour of each agent in terms of transi-tion functions [19,21,22].

An embodied agent, being a central tenet of thismethod, is de�ined as any device or program hav-ing the ability to perceive its surroundings to sub-sequently in�luence the environment state, that maycommunicate with other agents and has an internalimperative to achieve its goal. In recent works [6, 20]the authors facilitated the inner structure of an embod-ied agent a, distinguishing �ive types of internal sub-systems in two groups: agent’s corporeal body (com-

Mobile base hardwareMobile base hardware

Mobile base controllerMobile base controller

Data aggregation, Localization, NavigationData aggregation, Localization, Navigation

MS Kinect driverMS Kinect driver Lidar driverLidar driver IMU interfaceIMU interface

MS KinectMS Kinect SICK LMS100SICK LMS100 GyroscopeGyroscope

Virtual effector

Real effector

e1

E1

c

r1

r2

Real receptor R1

Real receptor R2

x

rc1 x

rc2

r3

Real receptor R3

x

rc3

Control subsystem

y

rR1 y

rR2 y

rR3

Virtual receptor Virtual receptor Virtual receptor

x

Rr2x

Rr1 x

Rr3

y

cr2y

cr1 y

cr3

y

ec1 x

ec1

x

ce1 y

ce1

y

Ee1 x

Ee1

x

eE1 y

eE1

Depth image Range data Angular speed

Point cloud Point cloud IMU readings

Odometry information

Encoder readingsMotor velocities

Linear and angular speed

Fig. 1. Control system structure

28


Articles28


posed of its effector E and receptor R) and controlsystem (containing virtual effector e, virtual receptor rand a control subsystem c). Virtual effectors e and vir-tual receptors r form a hardware abstraction layer,presenting to the control subsystem both the state ofeffectors and receptors readings in a form that is con-venient from the point of view of control. They alsomake it easy to replace effectors or receptors with-out changes in a control subsystem, as long as newmodules are compatible with the created interface.All those subsystems communicate with each othervia buffers and, besides that, it is assumed that theypossess internal memory. Hence the evolution of thestate of each of those subsystems is de�ined in terms ofa transition function, transforming data taken from in-put buffers and the internal memory into values writ-ten to output buffers (andback to the internalmemoryas well) and sent subsequently to the associated sub-systems.

Aside of the brie�ly described decomposition theauthors also introduced data �low diagrams, supple-menting the transition functions in the task of descrip-tion of the behaviour of each of the subsystems consti-tuting the control system. They also introduced a con-sistent notation, simplifying the description of suchsystems, as follows. A one-letter symbol located in thecenter (i.e.E,R, e, r, c) designates a subsystem. To ref-erence its subcomponents or to single out the state ofthis subsystemat a certain instant of timeextra indicesaround the central symbol are placed. A left super-script designates the referenced buffer of the subsys-temor in the case of a function – its type. A right super-script designates the time instant at which the state isbeing considered. A left subscript tells us whether thisis an input (x) or an output (y) buffer. A right subscriptrefers to the numbers of: a subcomponent (if we haveonly one agent and one control system, the rest of orig-inal symbols can be omitted), an internal memory cellor the ordinal number of a function. In the followingwe will utilize this method for the description of thedesigned control systemofElektron [12]mobile robot.

The control system of Elektron mobile robot wasdesigned in a way that simpli�ies its usage not onlyin a variety of applications on this speci�ic robot, butalso on different hardware platforms, without manyassumptions about hardware structure.Whole systemwas designed as a single agent (depicted on �ig. 1),without any communicationbufferswith other agents,with a control subsystem c as its vital part. Aside fromcontrolling the system, to adapt the system to a par-ticular robotic platform, one must provide virtual re-ceptors and effectors implementation, that would in-terface the control system and real hardware.

The analysis of 3D scene acquisition methods [11]was a basis of choice of the adequate set of sensors forthis task. Virtual receptors responsible for obstacle de-tection (indescribedapplication r1 and r2) gatherdatafrom either MS Kinect or SICK lidar and convert thosereadings into common representation (a cloud of 3D

points). The transition functions of both modules areresponsible only for changing the representation, sothey are not described in detail. In short, in both vir-tual receptors the data read from input buffer R

x rk istransfered to c

yrk . Virtual receptor r3 is responsible forcommunicatingwith a gyroscope and generatingmes-sages of current angular speed. Buffer c

yr3 containsthis speed expressed in SI units (rad/s). On the otherside of control system is virtual effector, that trans-lates commands from common format (linear and an-gular speed in c

xe1) to velocity values for both mo-tors (Ey e1) and returns current robot position (wheelodometry c

ye1) calculated from readings from wheel-mounted encoders (Ex e1).

The control subsystem is divided into a set ofmod-ules responsible for different tasks: (i) robot local-ization (sec. 3.1) (local provided by f1 transfer func-tion and global by f3), (ii) data �iltering (sec. 3.2)(f2), (iii) obstacle mapping (sec. 3.2) and path plan-ning (sec. 3.3) (f4). Those parts are independent onthe hardware used, so the whole control system isportable between platforms. On the other hand, eachof its subparts can be freely replaced with other (con-forming to proper interface), so path planning algo-rithms, data �iltering, environment mapping etc. canbe either tuned or even replaced by other implemen-tations, more suitable for speci�ic task (without an im-pact on the whole structure of the agent).

There are three independent sourcesproducing in-formation about ourmobile robot relative localization.Wheel odometry (available through e

xc1 buffer), basedon incremental optical encoders and calculation of theposition using matched consecutive laser scans [3],give full information about robot position and orien-tation (x, y, θ). Gyroscope readings (rxc3), on the otherhand, can be used only to calculate a change in the ori-entation. All three sources return data with differentfrequencies, but ExtendedKalman Filter [15], which iswidely used in robotics, helps to incorporate all avail-able readings into �inal estimation of local robot posi-tion. Transition function f1 performing corrected locallocalization is depicted on �ig. 2.

x

ec1

i

x

rc3

i

x

rc2

i

S Li

cc1

i

f1

Polar scan

matching

Polar scan

matching

Extended

Kalman

filter

Extended

Kalman

filter

Global robot position (relative to provided map)is calculated using a particle �ilter with laser scans asinput data [16]. Additionally, current estimated posi-tion described earlier can be used in order to speedup calculations. It’s worth mentioning, that providedmap can be only a rough representation of the envi-ronment, containing onlywalls and other,most impor-

29


Articles 29


tant features, like doors or unmovable obstacles. Ad-ditional obstacles, that appear in laser readings, havevirtually no impact on localization results, as long asthey don’t cover most of the robot’s �ield of vision.Fig. 3 shows the structure of transition function f3 ex-ecuting global localization.

f3

AMCL

localization

AMCL

localization

cc1

i

x

rc2

i

cc3

i

cc4

i+1

Fig. �. �ransi�on f�n��on of a global lo�ali�a�on on pro�vided map

3.2. Mapping of the obstacles

During a robot movement in an environment,which is not fully known (or sometimes even totallyunknown), detecting andmapping of a current con�ig-uration of obstacles plays a vital role. In a static envi-ronment it’s suf�icient only tomark obstacles on amapafter detecting them, but when obstacles can move ordisappear, it’s clearing obstacles frommap what is es-sential (this part is much harder to do, since it’s hardto tell whether obstacle really disappeared or we justcan’t detect it at the moment). The whole process ofobstacle detection and mapping consists of consecu-tive steps of data aggregation from available sensors,�iltering it, marking obstacles in an intermediate, 3Drepresentation and �inally the creation of 2D costmapfor a motion planner.

3� �ep�esenta�on of the en�i�on�ent Out of thementioned steps, the intermediate 3D representationof the environment had to be considered �irst, becauseit affects all the other tasks. Methods based on geo-metricmodeling of surroundings [14] are successfullyused in tasks like a creation of 3D plans of buildingsor open spaces, additionally giving possibility to �il-ter out moving objects from �inal result. Octrees [5]are also widely used as a base structure when imple-menting obstaclemaps (in this case the biggest advan-tage is a memory usage ef�iciency). Both approacheshave a common drawback, which is methods of re-moving information about obstacles from them. In oc-trees, when a cell is marked as used, it’s hard to per-form inverse operation, leading to consistent results.An intermediate solution, containing features fromboth two- and three-dimensional representations islayeredmaps. In this case, the whole space is split intohorizontal layers, each representing a �ixed range ofvalues from a vertical axis (usually z axis). On eachlayer, an information can be stored either geometri-cally [10] or in �ixed rectangular grid (voxel repre-sentation). In the described system the key part isrobust obstacle marking and clearing in close robotsurrounding and there is less need for accurate map-ping of whole corridors (because dynamic character

of environment long-range maps with old informa-tion about obstacles could have negative impact onglobal path planners). Taking into account the men-tioned facts, a voxel representation (layered map withdiscrete rectangular grid) has been chosen as a basestructure for an obstacle map implementation.

�ata agg�ega�on an� �lte�ing The �irst step ofprocessing sensor readings and creating a motioncostmap is data aggregation itself. In the presented so-lution there are two sources giving information aboutobstacles – SICK LMS100 lidar and Microsoft Kinect.Both sensors produce high amount of data, which caneasily lead to overloading the processing unit. Takinginto account rather slow movement speed, to detectobstacles only 5 readings per second are used from li-dar and 2 per second from Kinect. Data from sensorscxrk (for k = 1, 2) is throttled to desired frequency, giv-ing throttled pointcloud TPk .

T Pki

F Pki

x

rc ki

cc2

i+1

Frequency

limitation

Frequency

limitationOutlier

removal

Outlier

removal

Deppresion

detection

Deppresion

detection

f2

Fig. 4. Filtering readings from available sensors

This pointcloud is then passed through the seriesof �ilters (�ig. 4), which remove incorrect points andproduce new, virtual ones. Incorrect points, being sen-sor noise, could lead to detecting non-existing obsta-cles, which sometimes make the goal unreachable. Toremove that kindof noise, everydata point that has toofew neighbours in its surrounding is removed (bothsensors produce dense sets of points, so there shouldbe no solitaire ones), producing �iltered cloud FPk .

z=0

z=10cmO

sensor

Intersection of rays with base plane

Real point (sensor reading) Virtual point (generated)

P

P'P''

Fig. �. �ete��ng dangero�s slopes

Afterwards, when false readings are removed, thenext �ilter processes the cloud of points in order to de-tect possibly dangerous �loor slopes (like stairs lead-ing down or holes in the ground). The mechanism issimple, but it gives very good results. All points hav-ing z coordinate over set threshold (e.g.−0.02m) passthrough this �ilter without any change, the rest is re-�lected by xy plane, thus point P = (x, y, z) becomesP ′ = (x, y,−z). This way a „virtual wall” is pro-duced (�ig. 5) at the distance of the closest detecteddepressed obstacle, but it’s still possibly far away from

30


Articles30


real beginning of dangerous zone, so in next step an-other set of virtual points is produced. A line, con-necting detected, negative point P with sensor originO = (ox, oy, oz), intersects with ground plane at somecoordinates (x′′, y′′, 0) (1). A point P ′′ is produced byraising the intersection point 10cm above the ground,so it creates a virtual obstacle for robot. Those virtualpoints, in result, force path planners to avoid detected,dangerous zones, making thewhole system safer. A �il-tered cloud is then stored in the internal memory ofthe control subsystem, in cc2.

P ′′ = (ox + (x− ox) · s, oy + (y − oy) · s, 0.1) (1)

with s = ozoz−z .

Voxel map All the points that passed the �ilteringstage, are put into an occupancy map creating mod-ule. Its internal representation is based on voxel grid(three dimensional cuboid grid). The manipulation ofgrid parameters allows to balance betweenhigher res-olution (leading to better mapping and allowing robotto ride through smaller gaps) and lower computa-tional load (for coarse grained maps) leading to fasterpath computation. A large size of cells, on the otherhand, could make some places unreachable, becauseeven small obstacles, like thin table legs, are repre-sented by whole cells being marked as occupied.

When a preprocessed pointcloud is passed to thisstage, prior to marking new obstacles, a removal pro-cedure is executed. For every point in current mea-surement the same algorithm is used – every cell, thatintersects with line connecting measured point withsensor origin, is marked as free. Afterwards all cells,inwhich points fromcurrentmeasurement are placed,are marked as occupied (this task is much simplerthan the clearing stage).

The order of those two passes is important – ifboth clearing and marking were done in a single pass(i.e. for every point clearing and marking executed atsight), a situation would be possible, where just cre-ated obstacles are cleared by consecutive processedpoints from the same pointcloud. A voxel representa-tion and the described algorithm allows an easy in-corporation of readings from multitude of sensors, inthis case lidar and Kinect are usedwithout any changein algorithm, with one important difference betweenthem – the points from Kinect are used for both clear-ing andmarking, lidar, returning only planar readings,is used only for marking purposes.

Final costmap Aftermarking all obstacles in an inter-mediate voxel grid, all cells are projectedonto a twodi-mensional costmap. All columns containing occupiedcells are marked as occupied, otherwise, the column’svalue depends on a number of unknown cells. If morethan 4 cells are unknown, then the entire column isconsidered unknown, in every other case it is markedas free to move.

After marking, all the obstacles are in�lated byspeci�ied value, creating safety zones around objects,

which gradually increase the movement cost for cells.Lowering in�lation radiusmakes robot go closer to ob-stacles, but can possibly lead to a situation, in whichrobot will ride along the wall so close, that turningis impossible without hitting it. Larger radius, on theother hand, will make some places unreachable.

Fig. 6. Local costmap. In the background a global mapis visible, darker cells mean occupied cells, safety zonesare marked light gray and white cells are empty zones.�ddi�onally, laser scan readings are marked as whitepoints.

A local costmap has a �ixed size and its origin isplaced always in the origin of the robot’s mobile base(�ig. 6). In unstructured environments with dynam-ically changing con�iguration of obstacles keeping amap of all obstacles may cause some troubles, thatwerementioned in earlier sections. A smaller costmaphas alsomuchbettermemory ef�iciency, because it hasa �ixed size instead of continuously growing as robotproceeds.

3.3. Path planning

When a goal is commanded to the robot (for ex-ample operator sets it manually), a global path plan-ner is executed. This module is exchangeable, so anyimplementation following the interface rules can beused at this point (so it’s one of few places, where fu-ture experiments using already created hardware andsoftware platform can be carried on, even concern-ing such complicated ones as a chaotic space explo-ration [4]). In test applications Dijkstra algorithmwasused as a global planningmethod, searching for a pathin a graph built upon a current global map. Initially,only the global map is used (in the case when no ob-stacles are detected yet) and if a plan can’t be executed(for example a door is closed), currently detected ob-stacles are included in a global path planning.

When a global planGR is created, its parts are sentto the mobile base controller, where a local path plan-ner (using Dynamic Window Approach) tries to com-mand the robot in a way that follows the given trajec-tory part the best. The local planner uses only a lo-cal costmap around the robot LM and current odom-etry readings e

xc1. Parameters for the local planner areset so that each move ends as near as possible to thefollowed trajectory. After experiments with this mod-ule and tweaking parameters values, the robot fol-lowed the given global path with rather high averagespeed (approx. 17cm/s, which is 75% of its maximumspeed), smoothly avoiding obstacles. General struc-

31


Articles 31


cc1

i

f4

y

ec1

i+1Obstacle

mapping

Obstacle

mapping

Local

planner

Local

planner

Global

planner

Global

planner

cc3

i

cc4

i

cc2

i

x

ec1

i

Emergency

behaviours

Emergency

behaviours

G Ri

LMi

�i�. �. �eneral structure of na�i�a�on subs�stem

ture of navigation subsystem is depicted on �ig. 7.

�.4. �e�� ete��n �n� s��in�

When the robot moves through an environment,where obstacles are potentially a-going, even bestplanning algorithms couldn’t avoid being stuck incertain situations (contrary to static environments).When that kind of situation emerges, a series of pos-sible solutions could be applied. First of all, of course,robot stops. Then all obstacles further that set thresh-old (half a meter) are cleared from the costmap androbot makes a full turn (360°, if possible) to refreshthe obstacle con�iguration. If at this point the robot isstill stuck, more aggressive algorithms are used, untilall possible solution methods are tried. If at any pointthe robot can continue following its route, the dead-lock is over.

4. ExperimentsThe experiments were conducted using Elektron

mobile robot [12], equipped with control hardwaredeveloped according to the methodology presentedin [18]. As a set of research oriented controllers, it pro-vides access to all the essential parameters and sincethe communication protcol is open, it allows to re-place any hardware module or plug an additional one.To make the high level software portable, the authorsdecided to base the control system implementationon ROS software framework [9], which is widely usedin robotics. In general, virtual effector e1 and virtualreceptors r1..3 are implemented as separate nodes,whilst control subsystem c is divided into multiplenodes, one for each transition function fi. Operationsinside transition functions (i.e. outlier removal or localplanner) are implemented as nodelets running insidecorresponding transition functions’ node. Communi-cation between nodes and nodelets (both transmis-sion buffers between subsystems and internal trans-missions) are mapped as ROS topics.

The experiments were conducted in of�ice andworkshop rooms, where many objects were con-stantly carried over the whole place. Those experi-ments aimed at con�irmation of propriety of chosenapproach. In some experiments robot moved throughthe corridor, where students moved chairs and tablesfrom place to place particularly often (�ig. 8), doors

�i�. �. �le�tron robot na�i�a�n� in tar�et en�ironment

were opened and closed, and, of course, like in [7],there were many people walking along (�ig. 8). Ac-cording to the assumptions, robot had to safely drivethrough avoiding all obstacles and try to reach a givengoal.

The basic scenario of each experiment was thesame – the robot had to reach a goal (set by operator)autonomously and safely, without any collision withobstacles. �arious sensor con�igurations were tested�only Kinect, only lidar and a fusion of data from both.Each experiment was repeated a number of times tocalculate average statistics.

In cases when only one sensor was used, the ad-vantage of lidar over Kinect in terms of overall move-ment speed was visible. A wide �ield of view of SickLMS100 (270°) makes it possible to plan the pathfaster, without the need for too many robot in-placerotation, which was necessary with narrow angle of57° Kinect sensor. Those rotations were used in or-der to �ind obstacles at the sides of the robot. On theother hand, Kinect detected many obstacles that wereinvisible for lidar, like of�ice chairs bottoms, which al-lowed setting much lower critical distance to obsta-cles during the ride (20cm in case with lidar, 5cm forKinect). This in turnmademuchmore goals reachablefor the robot, which precisely maneuvered throughnarrow spots (like almost closed doors). The biggestdisadvantage of Kinect in terms of obstacle detectionwas inability to detect narrow table legs, which waspotentially dangerous. When both data sources wereenabled at the same time, the drawbacks of each ofthem were eliminated by other sensor capabilities,which improved both overall robot speed and preci-sion of obstacle avoidance. Table 1 presents summa-rized statistics of conducted experiments.

Tab. 1. Summarized results for 20 repeats of each exper-iments

Sensors Avg. speed Obstaclecrit. dist.

Lidar 16cm/s 20cmKinect 12cm/s 5cmKinect + lidar 17cm/s 5cm

The key issue of the control system of a mo-bile robot operating in common space with humansis safety. During the tests, there were many peopleworking and walking along, in�luencing robot actions

32


Articles32


(sometimes even on purpose). This led to new obsta-cles appearing and, in extreme cases when robot wassurrounded by them, system stopped movement and,after a few rotations to �ind a clear path, aborted thegoal. In every case, the system properly recognizedstairs (when using Kinect) and, if commanded to go toa point unreachable because of them, the system re-turned an appropriate error message.

5. ConclusionsIn an unstructured and dynamic environment, like

of�ice and laboratory spaces, three dimensional sen-sors seem to be the most appealing method for fastand reliable detection of a wide variety of obstacles.Augmenting this data with wide-angle readings fromlidar further improves the detection results, makingit possible to avoid almost every solid obstacle (e.g.small, laying on ground, hanging from top, or narrowlegs). Additional CPU load caused by the processing ofthree-dimensional point clouds is compensated by anextended set of detectable obstacles, which makes thewhole system robust and safe.

The presented system can be used in unstructuredand cluttered environmentswithout their prior prepa-rations, in particular there is no need either to re-move anyobstacles, ormark safe zones. Thanks to this,a robot can be quickly deployed in a new environmentand during its work those places can be simultane-ously used by people. It makes a range of a new ap-plications possible (e.g. mobile robot assistant of theman) as the future work.

AcknowledgementsThe authors gratefully acknowledge the support

of this work by The National Centre for Researchand Development grant PBS1/A3/8/2012. TomaszWiniarski has been supported by the European Unionin the framework of European Social Fund through theWarsaw University of Technology Development Pro-gramme.

AUTHORSMaciej Stefańczyk∗ – Institute of Control and Compu-tation Engineering, Warsaw University of Technology,00–665 Warszawa, ul. Nowowiejska 15/19, e-mail:[email protected] Banachowicz – Bionik Robotics Sci-enti�ic Club, Warsaw University of Technology,00–665 Warsaw, Nowowiejska 15/19, e-mail: [email protected], www: http://bionik.ia.pw.edu.pl.Michał Walęcki – Institute of Control and Compu-tation Engineering, Warsaw University of Technol-ogy, 00–665 Warsaw, Nowowiejska 15/19, e-mail:[email protected] Winiarski – Institute of Control and Com-putation Engineering, Warsaw University of Technol-ogy, 00–665 Warsaw, Nowowiejska 15/19, e-mail:[email protected].∗Corresponding author

REFERENCES[1] R. Alami, R. Chatila, S. Fleury, M. M. Ghallab, and

F. Ingrand, “An architecture for autonomy”, Int.J. of Robotics Research, vol. 17, no. 4, 1998, pp.315–337.

[2] R. A. Brooks, “A robust layered control system fora mobile robot”, IEEE Journal of Robotics and Au-tomation, vol. 2, no. 1, 1986, pp. 14–23.

[3] A. Diosi and L. Kleeman, “Laser Scan Matching inPolar Coordinates with Application to SLAM”. In:IEEE/RSJ International Conference on IntelligentRobots and Systems, 2005.

[4] A. A. Fahmy, “Implementation of the chaotic mo-bile robot for the complex missions”, Journal ofAutomation Mobile Robotics and Intelligent Sys-tems, vol. 6, no. 2, 2012, pp. 8–12.

[5] D. Jung and K. Gupta, “Octree-based hierarchicaldistance maps for collision detection”. In: Inter-national Conference on Robotics and Automation(ICRA), vol. 1, 1996, pp. 454–459.

[6] T. Kornuta and C. Zieliński, “Behavior-based con-trol system of a robot actively recognizing handpostures”. In: 15th IEEE International Conferenceon Advanced Robotics, ICAR, 2011, pp. 265–270.

[7] B. Kreczmer, “Mobile robot navigation in thepresence of moving obstacles (in polish)”, Ad-vances in Robotics. Robot controll with surround-ing perception., 2005, pp. 177–186.

[8] E. Marder-Eppstein, E. Berger, T. Foote,B. Gerkey, and K. Konolige, “The of�icemarathon:Robust navigation in an indoor of�ice environ-ment”. In: International Conference on Roboticsand Automation (ICRA), 2010, pp. 300–307.

[9] M. Quigley, K. Conley, B. Gerkey, J. Faust, T. Foote,J. Leibs, R. Wheeler, and A. Y. Ng, “Ros: an open-source robot operating system”. In: ICRA work-shop on open source software, vol. 3, no. 3.2, 2009.

[10] P. Skrzypczynski, “2D and 3D world modellingusing optical scanner data”, Intelligent robots:sensing, modeling, and planning, vol. 27, 1997, p.211.

[11] M. Stefańczyk andW. Kasprzak, “Multimodal seg-mentation of dense depth maps and associatedcolor information”. In:Proceedings of the Interna-tional Conference on Computer Vision and Graph-ics, vol. 7594, 2012, pp. 626–632.

[12] W. Szynkiewicz, R. Chojecki, A. Rydzewski,M. Majchrowski, and P. Trojanek. “Modular mo-bile robot – Elektron (in Polish)”. In: K. Tchoń,ed., Advances in Robotics: Control, Perception andCommunication, pp. 265–274. Transport andCommunication Publishers, Warsaw, 2006.

[13] S. Thongchai, S. Suksakulchai, D. Wilkes, andN. Sarkar, “Sonar behavior-based fuzzy controlfor a mobile robot”. In: Systems, Man, and Cy-bernetics, 2000 IEEE International Conference on,vol. 5, 2000, pp. 3532–3537.

33


Articles 33


[14] S. Thrun, W. Burgard, and D. Fox, “A real-time al-gorithm for mobile robot mapping with applica-tions to multi-robot and 3d mapping”. In: Inter-national Conference on Robotics and Automation(ICRA), vol. 1, 2000, pp. 321–328.

[15] S. Thrun, W. Burgard, and D. Fox, ProbabilisticRobotics, The MIT Press, 2005.

[16] S. Thrun, D. Fox, W. Burgard, and F. Dellaert, “Ro-bust monte carlo localization for mobile robots”,Arti�icial intelligence, vol. 128, no. 1, 2001, pp.99–141.

[17] H. Van Brussel, R. Moreas, A. Zaatri, and M. Nut-tin, “A behaviour-based blackboard architecturefor mobile robots”. In: Industrial Electronics So-ciety, 1998. IECON’98. Proceedings of the 24thAnnual Conference of the IEEE, vol. 4, 1998, pp.2162–2167.

[18] M. Walęcki, K. Banachowicz, and T. Winiarski,“Research orientedmotor controllers for roboticapplications”. In: K. Kozłowski, ed., Robot Motionand Control 2011 (LNCiS) Lecture Notes in Con-trol & Information Sciences, vol. 422, 2012, pp.193–203.


[20] C. Zieliński, T. Kornuta, and M. Boryń, “Speci�ica-tion of robotic systems on an example of visualservoing”. In: 10th International IFAC Symposiumon Robot Control (SYROCO 2012), vol. 10, no. 1,2012, pp. 45–50.

[21] C. Zieliński, T. Kornuta, P. Trojanek, andT. Winiarski, “Method of Designing AutonomousMobile Robot Control Systems. Part 1: The-oretical Introduction (in Polish)”, PomiaryAutomatyka Robotyka, no. 9, 2011, pp. 84–87.

[22] C. Zieliński, T. Kornuta, P. Trojanek, andT. Winiarski, “Method of Designing AutonomousMobile Robot Control Systems. Part 2: An Exam-ple (in Polish)”, Pomiary Automatyka Robotyka,no. 10, 2011, pp. 84–91.

34


[14] S. Thrun, W. Burgard, and D. Fox, “A real-time al-gorithm for mobile robot mapping with applica-tions to multi-robot and 3d mapping”. In: Inter-national Conference on Robotics and Automation(ICRA), vol. 1, 2000, pp. 321–328.

[15] S. Thrun, W. Burgard, and D. Fox, ProbabilisticRobotics, The MIT Press, 2005.

[16] S. Thrun, D. Fox, W. Burgard, and F. Dellaert, “Ro-bust monte carlo localization for mobile robots”,Arti�icial intelligence, vol. 128, no. 1, 2001, pp.99–141.

[17] H. Van Brussel, R. Moreas, A. Zaatri, and M. Nut-tin, “A behaviour-based blackboard architecturefor mobile robots”. In: Industrial Electronics So-ciety, 1998. IECON’98. Proceedings of the 24thAnnual Conference of the IEEE, vol. 4, 1998, pp.2162–2167.

[18] M. Walęcki, K. Banachowicz, and T. Winiarski,“Research orientedmotor controllers for roboticapplications”. In: K. Kozłowski, ed., Robot Motionand Control 2011 (LNCiS) Lecture Notes in Con-trol & Information Sciences, vol. 422, 2012, pp.193–203.


[20] C. Zieliński, T. Kornuta, and M. Boryń, “Speci�ica-tion of robotic systems on an example of visualservoing”. In: 10th International IFAC Symposiumon Robot Control (SYROCO 2012), vol. 10, no. 1,2012, pp. 45–50.

[21] C. Zieliński, T. Kornuta, P. Trojanek, andT. Winiarski, “Method of Designing AutonomousMobile Robot Control Systems. Part 1: The-oretical Introduction (in Polish)”, PomiaryAutomatyka Robotyka, no. 9, 2011, pp. 84–87.

[22] C. Zieliński, T. Kornuta, P. Trojanek, andT. Winiarski, “Method of Designing AutonomousMobile Robot Control Systems. Part 2: An Exam-ple (in Polish)”, Pomiary Automatyka Robotyka,no. 10, 2011, pp. 84–91.

34


Articles34


O��-�� A�� M�� R�� R��

O��-�� A�� M�� R�� R��

O��-�� A�� M�� R�� R��

O��-�� A�� M�� R�� R��


Dominik Belter

DOI: 10.14313/JAMRIS_4-2013/35

Abstract:The paper presents a mo�on planning algorithm fora robot wal�ing on rough terrain. The mo�on�planer isbased on the improved RRT (Rapidly Exploring RandomTree��onnect algorithm. The �ar�cle �warm �p�mi�a��on (�� algorithm is proposed to solve a posture op��mi�a�on problem. The steepest descent method is usedto determine the posi�on of the robot�s feet during theswing phase. The gradient descent method is used forsmoothing the �nal path. The proper�es of the mo�onplanning algorithm are presented in four cases� mo�onplanning over a bump, concavity, step and rough terrainmoc�up. The maximal si�es of par�cular obstacle typestraversable by the �essor robot with the new, op�mi�edmo�on plan are given.

Keywords:wal�ing robot,mo�on planning, rough terrainlocomo�on

�� trod�c�o�Most of the outdoor mobile robots are equipped

with tracks or wheels. This kind of locomotion is veryef�icient on �lat terrain. When the terrain is rough themobility of such robots is limited by the size of obsta-cles (rocks, logs, etc.). Moreover, the tracked drive de-stroys the ground especially when the robot is turn-ing round. Walking is more ef�icient on rough terrain.Thus, to improve the mobility of legged mobile robotsonvarious types of terrainnewmotionplanningmeth-ods are needed. Recently, quadrupeds and bipedal hu-manoid robots are gaining popularity, but during Ur-ban Search and Rescue (USAR) missions multi leggedrobots are more suitable due to their inherent staticstability.

The nature of legged locomotion is discrete – dif-ferent than wheeled locomotion. Usually, the wheelsof a robot should have continuous contact with theground, whereas each leg of a walking robot has a se-quence of swing and stance phases. The feet of a robothave point-like contact with the ground. As a result,the locomotion task includes a sequence of robot’spostures and transitions between these postures (in-terpreted as discrete states).

Motion planning of a robot walking on rough ter-rain is a complex task. Walking robot can’t be repre-sented as a point in an environment with obstacles.While walking on rough terrain the robot should takeinto account full shape of the terrain. Collisionwith theground as well as static stability and workspace of therobot (a space which the feet of the robot can reach)

should be considered. Finally, motion planning for awalking robot is a problem with many constraints,and the solution has to be determined in a multi-dimensional solution space.

XR

YR

ZR

L2 L1

L3

L4

L5

L6

16

17

180.23 0.16

0.055

0.15 0.065

O[xR,yR,zR]T

�i�� inem�� s��e o� ��e �esso� �o�o� �� i�mensions in [m])

The posture of our walking robot (the state of therobot) Messor [24] is determined in a 24-dimensionalspace. The kinematic structure of the robot is pre-sented in Fig. 1. The position of the robot’s body inthe reference frame O is determined by a 3D vec-tor [xR, yR, zR]

T . Orientation of the robot’s body isgiven by three RPY (roll-pitch-yaw) angles θR, φR, γR.Each joint position is given by αi value. The num-bering of joints starts from the shoulder servomotorof the �irst leg, and ends on the joint between tibiaand femur of the sixth leg (Fig. 1). The posture is de-termined by 18 reference values for servomotors lo-cated in robot’s legs and six values representing po-sition and inclination of the robot’s platform (q =[α1, ..., α18, xR, yR, zR, θR, φR, γR]

T ). We are lookingfor methods which can deal with multi-dimensionalsolution space to �ind a path for the robot walking onrough terrain. Moreover, the sought algorithm shouldbe fast enough to operate in real time. To �ind feasi-blemotion of the robot themotion planner should �indtheway how to go round obstacles and, if necessary, to�ind the way how to climb obstacles.

�inematic structure of a robot in�luences the ap-proach to motion generation. Various approaches areused to obtain ef�icient locomotion on rough ter-rain. The rimless wheel structure is used in the RHexrobot [21]. The rotational motion of the shaft is con-verted into walking behavior through �lexible spokes.Robust locomotion on rough terrain can be also gen-

35


Articles 35


erated by using reactive controllers. The example ofsuch a robot is the BigDog [29]. Both methods will failwhenever the robot has to deal with very risky en-vironment e.g. while crossing the stream. Without afoothold selection method the robot will fall down. Inour approachwe precisely plan a full path of the robot.The robot determines its posture, selects footholds,plansmotion of the feet and body. It also checks stabil-ity, workspace and avoids collisions with the groundand self-collisions.

By planning full motion of the robot it is also possi-ble to avoid deadlocks (situationswhen the robot can’tmovebecause of the lack of stability or lack of the kine-matic margin). Planning motion of the robot also al-lows to avoid local minima (here understood as a loopofmoving backwards and forwards, oscillating aroundwhat is in these circumstances a lower-dimensionalitylocal minimum [1]), and to avoid obstacles. This be-havior is possible when a reactive controller is used.A reactive controller computes next action taking intoaccount a current context. Thus, the reactive controllermight cause ’in�inite loops’ in robot’s behavior, e.g. ac-cording to the current context the next motion of therobot is step forward, then step back and again for-ward, back, etc. To avoid the robot getting trapped inlocal minima the deliberative paradigm is used in thecontrol software of the machine.

The paper presents recent improvements of themotion planning method for the Messor six-leggedrobot. Thismethod is based onRapidly-exploringRan-dom Trees Connect algorithm (RRT- Connect) [16].During execution of the planned path the robot cre-ates an elevation map of the environment using theHokuyo�RG-04L� laser range �inder [3]. Then, the ob-tained map is used to plan the motion of the robot.Themethods presented in this paper increase the ef�i-ciency of themotion planningmethod.With the recentimprovements the robot is capable to climb higherobstacles by avoiding deadlocks. The deadlocks arecaused by the lack of static stability or the lack of akinematicmargin. The results are obtained using vari-ous optimizationmethods to �ind a path of feet duringswing phase, to optimize posture of the robot and tosmooth the obtained path.1.1. Related work

Autonomous walking on rough terrain is muchmore challenging than indoor locomotion [20]. Manyimpressive results were obtained using the LittleDogrobot [6, 19, 32]. In this case the robot also uses var-ious optimization methods during motion planning.The trajectory of the robot is optimized to maximizethe stability and kinematic reachability. In the workof Ratliff et al. [19] the trajectory optimization is per-formed using the gradient-based Covariant Hamilto-nian Optimization and Motion Planning (CHOMP) al-gorithm. Similarly, Kalakrishnan et al. proposed theuse of optimizationmethods to improve the trajectoryof the robot’s body with respect to the initial Zero Mo-ment Point (ZMP) trajectory [7]. To plan the motionof the robot the Anytime Repairing A* (ARA*) algo-rithm [18] is used. The algorithmmaximizes footholds

rewards from the initial to the goal position.Most of thewalking robots aswell as controlmeth-

ods are inspired by biological systems. Inspirations forsix-legged robot are taken from behavior of stick in-sects [15]. The robot can use a dynamic gait whichis based on results of Nature observations [30]. Pe-riodic signals to control gait of the robot can be ob-tained using a Central Pattern Generator (CPG). Theconcept of CPGs is based on the real nervous structurewhich can be found in living creatures [31]. This con-cept is also successfully applied in multi-legged walk-ing robots [9].

In our research we also take inspiration from bi-ology. Our robot uses statically stable gaits and theleg coordination patterns, as stick insect do. It walkson rough terrain using secure and rather slow type oflocomotion. The control system of the Messor robotstrongly depends on the exteroceptive sensory sys-tem. Dynamic control techniques require high-torquedrive units and precise proprioceptive sensing [14],which are not available to the Messor robot walkingon rough terrain. Thus, the dynamic capabilities of ourrobot are limited.

Not only optimization and inspirations from theNature can be used to obtain control system for walk-ing robot. The robot can also learn how towalk. In [26]a modi�ied Reinforcement Learning algorithm is pre-sented. It was shown that the robot can utilize past ex-perience and faster learn ef�icient locomotion on �latterrain. Similarly, a gait of humanoid robot can be op-timized to obtain maximal linear velocity [27].

Some procedures for control of six-legged robots,which allow them to climb high, but structural ob-stacles like steps [5] and stairs [25] are known fromthe literature. However, in our work we don’t as-sume particular shapes of the obstacles. The robot au-tonomously can deal with various obstacle. This be-havior is obtained using RRT-based planner and opti-mization methods which support motion planning.

In ourworkweutilize hybrid deliberative/reactiveparadigm.When the initial path of the robot is plannedthe robot can execute the planned motion using re-active and fast controller which guarantees more ro-bust locomotion. Moreover, during execution of theplanned motion the robot uses simple leg compliantstrategy. The robot stops the leg’s motion whenevercontact force at leg’s tip exceeds given thresholds. Thecompliant strategy is very important to robust loco-motion [10,31] or motion of a robot arm [13,28].

�. �o�o� �la�� al�or�t��

Find maximal possiblelength of robot’s step

dSTEP

NO

k>0?

Optimize trajectoriesof the feet

Check stability,workspace, collisions,for each point of the

path

FINISHFAILURE

STARTk:=1

Select foothods for robot’sconfigurationq (k d )

simp· STEP

Opimizeposture

YES

k k:= - 0,2

YES

Is motionpossible?

FINISHSUCCESS

NO

Fig. 2. �� p�� a�� nplanning

36


Articles36


The RRT-Connect-based integrated motion plan-ner [2] creates two trees, which are extended alter-nately. The root of the �irst tree is located in the initialposition of the robot. The root of the second tree is lo-cated in the goal position of the robot. At the beginningthe algorithm tries to add new node to the �irst tree(E�� procedure). The position of the new node isdetermined randomly. If the robot can reach the nextgoal position q(k · dSTEP) the node is added to the tree.Then, the second tree is extended in the direction tothe new node. In the following iteration of the algo-rithm the sequence is swapped and the second tree isextended in the random direction at the beginning.

The core of theRRTalgorithm is theE��proce-dure (Fig. 2). The procedure checks if the transition inthe desired direction is possible. If themotion is possi-ble the procedure plans the precise path for the bodyand each foot. The previous version of the procedurehas beenmodi�ied [2]. First, the new procedure deter-mines the maximal step length in the desired direc-tion (dSTEP). The maximal step length is found usingkinematic and collision constraints. To determine themaximal lengthof the stepwecreate the initial postureof the robot qsimp. The procedure assumes initial hori-zontal orientation of the robot’s body. The distance tothe ground is set to such a value that guarantees se-cure clearance between robot’s body and the ground(in experiments the clearance is set to 0.1m). The sub-procedure, which plans the path for the next robotstep, is executed �ive times for various step lengthsk · dSTEP (k ∈ {0.2; 0.4; ...; 1.0}). For the next poten-tial node of the tree the robot �inds footholds and op-timizes the posture [3, 4]. Next, the procedure planspath of feet during swing phase. The path of the bodyis straight between two neighboring nodes. Then, theprocedure checks if the execution of the planned pathis possible. Workspaces of the robot’s feet, collisionsand static stability are checked. Theprocedure returnsthe planned path for themaximal possible step length.

The RRT algorithm determines a new (xR,yR) po-sition of the robot in the global coordinate system. Forthis position the planning procedure determines fullstate (posture) of the robot (vertical position of therobot’s platform zR, inclination and leg’s con�igura-tion). To this end, the elevation map of the environ-ment is used. The direction of motion is representedby the vector

−−−−→piRp

i+1R identifying the previous position

of the robot piR = [xi

R, yiR] and the next position of the

robot pi+1R = [xi+1

R , yi+1R ]. The orientation of the robot

on the surfaceγR is equal to the angle between thevec-tor

−−−−→piRp

i+1R and x axis of the global coordinate system.

If the desired rotation around z axis can’t be executedwith the given kinematics of the robot (the referenceangle γR is not reachable) the reference rotation an-gle γR is limited to maximal rotation which can be ex-ecuted by the robot in a single step. In all experimentspresented in the paper the maximal rotation angle isset to 0.2 rad.

For the new position pi+1R = [xi+1

R , yi+1R ] the robot

selects footholds [3]. Inclination of the robot’s plat-form (θR, φR) and distance to the ground zR are de-termined by the posture optimization procedure.

The optimization procedure searches for the vec-tor pR = [θR, φR, zR]

T which determines the pos-ture of the robot with maximal kinematic margin ofthe robot dKM:

arg maxpR

{dKM(q(pR))},q(pR) ∈ Cfree,q(pR) ∈ Cstab,

(1)

where:q(pR) – posture of the robot for the given con�ig-uration pR and position of feet determined usingfoothold selection algorithm,dKM(q(pR)) – kinematic margin of the robot,Cfree – con�igurations of the robot which are col-lision free (lack of collisions with the ground andbetween parts of the robot) and inside of the robot’sworkspace,Cstab – statically stable con�igurations of the robot.

To �ind theoptimal postureof the robot theParticleSwarm Optimization (PSO) algorithm is used [4, 12].The kinematic margin diKM of the i-th leg is de�ined asthe distance from the current position of the feet tothe boundary of the reachable area of the leg [17]. Tocompute kinematic margin dKM of the robot the dis-tance from current positions of the feet to the bound-aries of the legs workspace are computed. The small-est distance (min(di=1

KM ), ..., di=6KM )) determines the dKM

value [4]. The computation of the kinematic marginshould be fast. It is important because the kinematicmargin is computed hundred of times during singleoptimization run. The analytical relation between con-�iguration of the leg and the value of the kinematicmargin is found to speed up the computations [4]. Toobtain this relation, the approximation with mixtureof Gaussian functions is used.

When the posture of the robot is determined therobot plans the path of each foot from the currentto the next foothold. The initial paths are located onthe plane which is perpendicular to the gravity vector.The initial and the goal footholds are located on thisplane. The shapeof the initial path is createdaccordingthe the vertical cross-section of the obstacles betweenfootholds [2]. The safety margin is added to decreasethe risk of collision. However, the found path does notguarantee proper execution of the planned motion.Some points of the planned path might be unreach-able for the robot (the positions of feet are outside ofthe robot workspace). Moreover the legs of the robotmight collide with other parts of the robot. Thus, it isnecessary to modify the position of feet in the direc-tion perpendicular to the direction of the foot motion.We call it leg-end optimization during swing phase.During the optimization the inverse and forward kine-matic of the leg are known, thus we can switch easilybetween the leg con�iguration and position of the foot.

37


Articles 37


We use both representations in the optimization pro-cedure. For example we use leg con�iguration to com-pute kinematic margin and we use position of the footto check collisions with the ground.

�ig� �� ss�se��n �� he �eg w��spa�e an� p��g�ess�� he g�a�ien��ase� �p��i�a��n �e�h�� ingswing phase

�ig� �� p��i�a��n �� he �eg�en� p�si��n ��ing swingphase �� he ��wing p�in�s �� he ini�a� pa�h

The goal of the optimization (in ourcase sub-optimization) is to �ind the po-sition of the i-th foot during swing phasepif = [xi

f , yif , z

if ]

T (expressed in the global coor-dinate systemO) which satis�ies the requirements forthe value of the i-th leg kinematic margin diKM:

diKM(pif ) > dsafeKM ,pi

f ∈ Cfree, (2)

where Cfree are leg’s con�igurations which are colli-sion free and inside of the workspace of the robot anddsafeKM = 8 cm.

We decided to use sub-optimization instead offull optimization because looking for leg-end positionwhich satis�ies safety requirements is suf�icient. Thesub-optimization stops searching when the kinematic

margin is bigger than dsafeKM = 8 cm. This approachspeeds up the optimization procedure. The procedureis run for each leg-end position and all points of theinitial path during swing phase (Fig. 4).

To �ind the optimal foot position the method ofsteepest descent is used [11]. The cross-section of theleg workspace and progress of the gradient-based op-timization method during swing phase are shown inFig. 3. If the initial position of the foot FSTART is out-side of the workspace of the leg the robot searchesfor the acceptable leg’s con�iguration (reference val-ues for servomotors) which has positive value of thekinematic margin. To this end, the current positionof the foot is moved iteratively in eight main direc-tions (Fig. 3). When the foot is inside of the workspaceof the robot, the procedure continues with gradient-based optimization. The position of the foot is mod-i�ied in the direction perpendicular to the directionof the foot motion (vertical cross-section through theleg’s workspace). The search space is limited by theshape of the obstacles increased by the safety margin(in the simulations presented in this paper the safetymargin is 3 cm).

2.3. Path smoothing

The �inal path found by the RRT-based algorithm ismade from straight lines which connect optimal pos-tures of the robot. Thus, the motion of the robot is notsmooth. In the nodes of the tree (positions of the robotwhere all feet are located on the ground and optimalposition is determined) the robot rapidly changes thedirection of themotion. To avoid rapidmovements weuse path smoothing on the results obtained from theRRT-based planner [8].

The goal of the path smoothing is the minimiza-tion of �itness function Fsm. The arguments of Fsm arethe positions of points in the initial path p found bythe RRT-based algorithm, and positions of the pointsin the smooth path p′:

arg minp′

Fsm(p′) =

n∑i=0

{α2 (pi − p′i)

2 + β2 (p

′i − p′i−1)

2 + β2 (p

′i − p′i+1)

2}.

(3)The element of the �itness function Fsm which de-

pends on the β coef�icient is responsible for pathsmoothing. Additionally, the element which dependson the α coef�icient is introduced. Without this ele-ment the optimal solution is a straight line (the po-sitions of the initial and �inal points are not opti-mized). To �ind the optimal value of the function (3)the gradient-based optimization is used [11]. At thebeginning of the optimization the smoothed path p′ isthe same as the initial path p. In the following step ofthe optimization the algorithm modi�ies the positionof the new path according to the equation:

p′newi = p′i + α(pi − p′i) + β(p′i−1 + p′i+1 − 2p′i).

(4)

38


Articles38


The positions of footholds are not modi�ied by theoptimization. Similarly, the points of the path are notmodi�ied if the new position is outside of the robotworkspace, if it causes collisions, or the new positionof the robot is not statically stable. Theprocedure endswhen the new modi�ication of the path ∆p is bellowthe pre-determined value∆max:

∆p =n∑

i=0

|p′newi − p′i| < ∆max. (5)

3. ResultsThe results of the procedure which optimizes the

position of feet during swing phase is presented inFig. 5. During the experiment the robot was climbinga step. When the optimization is used the robot modi-�ies the initial path of feet.Without using the proposedmethod the desired position of foot is outside of therobot’s workspace. The execution of the planned mo-tion is not possible. When the proposed optimizationprocedure is used the robot moves its leg sideway toincrease kinematic margin to avoid collisions with theobstacle. This optimization procedure allows to climbhigher obstacles and avoid deadlocks.

-1

-0.5

0

0.5

1

x [m]

-0.4

-0.35

-0.3

-0.25

-0.2

-0.15

-0.1

-0.05

0

y [m]

0

0.1

0.2

0.3

0.4

z [m] initial pathfinal pathfoothold

not executable(before optimization)

executable(after optimization)

Fig. �. Results of the foot path op�miza�on procedureduring swing phase on the step obstacle

-30

-25

-20

-15

-10

-5

0

x [cm]

-10

-5

0

5

10

y [cm] RRT pathα β=0.8, =0.2α β=0.1, =0.9

α β=0.01, =0.99

Fig. 6. Results of the path smoothing procedure for var-ious α and β parameters

Before execution of the planned motion the pathis smoothed. Th sequence of the body positions[xR, yR, zR, θR, φR, γR] as well as path of each foot ismodi�ied. Example results of path smoothing for vari-ous values of parameters are shown in Fig. 6.When the

parameter β is increased and the parameter α is de-creased the obtained path is smoother. when param-eter α is increased and parameter β is decreased theobtained path is similar to the initial path (Fig. 6).

The parameters are set to α = 0.01 and β =0.99 when the path for the body is smoothed. The pa-rameters are set to α = 0.8 and β = 0.2 whenthe paths of feet are smoothed. The goal is to obtainsmooth path for the robot’s body to limit unwantedsuddenmotions. The paths of feet aremore constraint(workspace of the robot and collisionswith obstacles).The footholds can’t be modi�ied in this stage thus thepath is only slightly smoothed.

The proposed methods were veri�ied in the re-alistic simulator of the Messor walking robot [24].Dynamic properties of the robot are simulated us-ing the Open Dynamics Engine [22]. The software li-brary detects collisions between parts of the robot’sbody and simulates behavior of the colliding objects(static and dynamic friction). The simulator also con-tains the software controller of the real robot. The en-vironment, the state of the robot, and paths obtainedusing themotionplanning algorithmarepresentedus-ing OpenGL procedures.

The properties of the proposedmethodswere pre-sented in simulations using three various obstacles1: abump (Fig. 7a), concavity (Fig. 7b) and step (Fig. 7c).Using standard shape of the obstacles we can presentef�iciency of the proposed methods. We found themaximal size of the obstacles which can be covered bythe robot using various variants of the motion plan-ning approach (Table 1). It should be also mentionedthat the proposed method is universal and does notassume any particular model of the obstacle.

Tab. 1. Maximal size of the obstacles which can be cov-ered b� the robot using various variants of the mo�onplanning approach. Table presents three variants: mo-�on planningwithout op�miza�on �1�� with posture op-�miza�on �� and with all op�miza�on procedures ��.

variant bump [m] concavity [m] step[m]Fig. 7a Fig. 7b Fig. 7c

1 0.13 -0.17 0.152 0.16 -0.185 0.183 0.16 -∞ 0.25

To show the ef�iciency of using presented opti-mization procedures we performed simulations withthree different obstacles: a bump, a concavity and astep. In the �irst simulation the robot plans its mo-tion using only the RRT algorithm. The position of therobot’s platform is determined using a simple strat-egy – the orientation of the robot’s body is the sameas the average slope of the terrain. The distance to theground is set to a value which is minimal but safe andguarantees that the motion is collision free [2]. In thesecond simulation the robot uses posture optimiza-tion algorithm to determine its posture in the nodes oftheRRT trees. In the third simulation the robot uses allmethods presented in this paper (posture optimiza-tion, feet path planning during swing phase and path

39


Articles 39


-0.50

0.51

1.5

x [m]

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

y [m]

0

0.1

0.2

0.3

0.4

0.5 z [m]

body pathfeet pathRRT branches

-0.50

0.51

1.5x [m]

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

y [m]

-0.1

0

0.1

0.2

0.3 z [m]

-0.50

0.51

1.5

x [m]

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

y [m]

-0.1

0

0.1

0.2

0.3

0.4

0.5 z [m]



a1 a2

b1 b2

c1 c2

�i�� e p�anne� pat� o� t�e �obot �� an� e�ec��on o� t�e p�anne� �o�on in si��ato� �� i�e c�i�bin�� a – b��p,b – concavity, c – step

smoothing).When the robot does not use optimization meth-

ods themaximal height of the bumpwhich can be cov-ered by the robot is 13 cm (Tab. 1). Using posture op-timization procedure the robot is capable to climb abump which is 3 cm higher. The maximal height ofthe obstacle is the same when the optimization dur-ing swing phase is used. To deal with the obstacle therobot has to put its legs on the ground level and onthe obstacle at the same time. If the obstacle is toohigh the reference position of the the feet are outsideof the workspace of the robot. The optimization dur-ing swing phase does not bring additional advantages.The path obtained using all optimization methods isshown in Fig. 7a.

When the posture optimization is used duringmo-tion planning the maximal depth of the concavity is18.5 cm. If necessary the robot decreases the distanceto the ground to reach the bottom of the concavityand safely moves to the other side of the ditch. Whenall optimization methods are used the depth of theconcavity does not play any role. The �� proce-dure (Fig. 2) checks various step lengths. For the givenwidth of the ditch the robot does not have to placeits feet on the bottom of the concavity. The robot canmake longer steps and place feet on the other side ofthe concavity. The �� procedure proposed in thispaper allows to autonomously modify length of stepsto deal with various obstacles. In this case path opti-mizationduring swingphasedoesnot bring additionaladvantages. The initial path is secure and guaranteesa proper kinematic margin.

The advantages of using optimization procedureduring the swing phase can be observed during stepclimbing. The obtained path is shown in Fig. 7C. Themotion of feet is modi�ied to increase kinematic mar-gin during swing phase. As a result the robot movesfeet sideways when they are above the edge of thestep. The proposed strategy allows to deal with steps25 cmhigh. If only the posture optimization procedureis used themaximal height of the step is 18 cm (Tab. 1).

To check the ef�iciency of the algorithm on terrainwith irregular obstacles a simulation on rough terrainmockup was performed. The results are presented inFig. 8. The distance between initial and �inal positionof the robot is 3 m. Because the algorithm is random-based, the series of 10 simulations was performed. Ineach trial the robotwas able to �ind a securepath to thegoal position. The average time of the path planningis 389 s and standard deviation is 44 s (simulationswere performed on a computer with Intel i7 2.7GHzprocessor).

4. ConclusionsThe article presents various optimizations strate-

gies to planmotion of the six- legged walking robot onrough terrain. The PSO optimization algorithm is usedto �ind posture of the robot during phase when all feetof the robot are placed on the ground. The gradient-based optimization method is used to determine feetposition during swing phase. A similar method is usedfor smoothing the path returned by RRT-based mo-tion planner. The presented methods are used as core

40


Articles40


elements of the new E�� procedure which deter-mines the following step of the robot.

The goal of using the optimization methods is tocreate algorithmwhich is capable toplanmotionof therobot on rough terrain with high obstacles. The capa-bilities of the algorithmwere presented in the simula-tor. We determined the maximal size of the obstacleswhich can be covered by the robot using the presentedmethods. The goal of the optimization can be easilymodi�ied. It could be the maximal security margin (asum of three weighted coef�icients: value of distancebetween feet and obstacles, kinematicmargin and sta-bility margin) or combination of few coef�icients. Inthe future we are going to implement the proposedoptimization methods on the real Messor robot. Tothis end, we are going to integrate the mapping algo-rithmwhich uses �okuyo laser range �inder and a self-localization algorithms.

5. AcknowledgementThis work was supported by NCN grant no.

2011/01/N/ST7/02080.The author would like to thank the anonymous re-viewers for their valuable comments and suggestionsto improve the quality of the paper.

Notes1A video with simulation experiments is available on

http://lrm.cie.put.poznan.pl/jamris2013.wmv

AUTHORDominik Belter∗ – Poznań University of Technol-ogy, Institute of Control and Information Engineering,ul. Piotrowo 3A, 60-965 Poznań, Poland, e-mail: [email protected].∗Corresponding author

REFERENCES[1] G. B. Bell, M. Livesey, ”The Existence of LocalMin-

ima in Local-Minimum-Free Potential Surfaces”,Towards Autonomous Robotic Systems (TAROS2005), London, UK,2005 pp. 9–14.

[2] D. Belter, P. Skrzypczyński, Integrated motionplanning for a hexapod robot walking on rough

terrain, 18th IFAC World Congress, Milan, Italy,2011, pp. 6918–6923

[3] D. Belter, P. Skrzypczyński, ”Rough terrain map-ping and classi�ication for foothold selectionin a walking robot”, Journal of Field Robotics,vol. 28(4), 2011, pp. 497–528

[4] D. Belter, P. Skrzypczyński, ”Posture optimiza-tion strategy for a statically stable robot travers-ing rough terrain”. In: Proc. IEEE Int. Conf. on In-telligent Robots and Systems, Villamoura, Portu-gal, 2012, pp. 2204–2209,

[5] S. Fujii, K. Inoue, T. Takubo, T. Arai, ”Climbing uponto Steps for Limb Mechanism Robot “ASTER-ISK””, In: Proc of the 23rd International Sympo-sium on Automation and Robotics in Construction,2006, pp. 225–230

[6] M. Kalakrishnan, J. Buchli, P. Pastor, M. Mistry, S.Schaal, ”Fast, robust quadruped locomotion overchallenging terrain”. In: Proc. IEEE Int. Conf. onRobotics and Automation, Anchorage, USA, 2010,pp. 2665- -2670

[7] M. Kalakrishnan, J. Buchli, P. Pastor, M. Mistry,S. Schaal, ”Learning, planning, and control forquadruped locomotion over challenging terrain”,Int. Journal of Robotics Research, vol. 30(2), 2010,pp. 236– 258.

[8] D. Dolgov, S. Thrun, ”Detection of principal direc-tions in unknown environments for autonomousnavigation”, In: Proc. of the Robotics: Science andSystems, 2008

[9] A. Gmerek, ”Wykorzystanie quasi-chaotycznychoscylatorów do generowania rytmu chodurobotów kroczacych”, Postępy robotyki 2012, K.Tchoń, C. Zieliśki (Eds.).

[10] A. Gmerek, E. Jezierski, ”Admittance control ofa 1-DoF robotic arm actuated by BLDC mo-tor”. In: 17th International Conference onMethodsand Models in Automation and Robotics (MMAR),2012, pp. 633–638.

[11] C.T. Kelley, Iterative methods for optimization,SIAM, Philadelphia, 1999.

[12] J. Kennedy, R.C. Eberhart, ”Particle swarmoptimization”. In: Proc. IEEE Int. Conf. on

41


Articles 41


Neural Networks, Piscataway, Australia, 1995,pp. 1942–1948.

[13] M. Kordasz, R. Madoński, M. Przybyła, P.Sauer, Active Disturbance Rejection Controlfor a Flexible-Joint Manipulator, Lecture Notesin Control and Information Sciences, RobotMotion and Control 2011, K. Kozłowski (Ed.),Springer-Verlag, 2011, pp. 247–256.

[14] M. Kowalski, M. Michalski, D. Pazderski,Quadruped walking robot WR-06 - design,control and sensor subsystems, Lecture Notes inControl and Information Sciences, Robot Mo-tion and Control 2009, Springer-Verlag, 2009,pp. 175–184.

[15] S. Krenich, M. Urbanczyk, ”Six-Legged WalkingRobot for Inspection Tasks”, Solid State Phenom-ena, vol. 180, pp. 137–144, 2011

[16] J.J. Kuffner, S.M. LaValle, ”RRT-Connect: An ef�i-cient approach to single-query path planning”.In: Proc. IEEE Int. Conf. on Robotics and Automa-tion, San Francisco, USA, 2000, pp. 995–1001.

[17] P. D. Kumar, D. Kalyanmoy, and G. Amitabha, ”Op-timal path and gait generations simultaneouslyof a six-legged robot using a GA-Fuzzy approach”,Robotics and Autonomous Systems, vol. 41, no. 1,2002, pp. 1-20.

[18] M. Likhachev, G. Gordon, S. Thrun, ”ARA*: any-time A* with provable bounds on suboptimal-ity”.In: Advances in Neural Information Process-ing Systems 16: Proceedings of the 2003 confer-ence, 2003.

[19] N. Ratliff, M. Zucker, J.B. Andrew, S. Srini-vasa, ”CHOMP:Gradient optimization techniquesfor ef�icient motion planning”. In: Proc. IEEEInt. Conf. on Robotics and Automation, Kobe,Japan, 2009, pp. 489– 494.

[20] R.B. Rusu, A. Sundaresan, B. Morisset, K. Hauser,M. Agrawal, J.C. Latombe, M. Beetz, ”Leav-ing �latland: ef�icient real-time 3D perceptionand motion planning”, Journal of Field Robotics,vol. 26(10), 2009, pp. 841–862.

[21] U. Saranli, M. Buehler, D.E. Koditschek, ”RHex:a simple and highly mobile hexapod robot”, In-ternational Journal of Robotics Research, vol. 20,2001, pp. 616–631.

[22] R. Smith, Open Dynamics Engine,http://www.ode.org, 2012.

[23] K. Walas, D. Belter, ”Supporting locomotive func-tions of a six-leggedwalking robot”, InternationalJournal of Applied Mathematics and ComputerScience, vol. 21, no.2, 2011, pp. 363–377.

[24] K. Walas, D. Belter, ”Messor – Versatile walkingrobot for search and rescue missions”, Journalof Automation, Mobile Robotics & Intelligent Sys-tems, vol. 5, no. 2, 2011, pp. 28-34.

[25] K. Walas, A. Kasiński, ”Controller for Urban Ob-stacles Negotiation with Walking Robot”. In:Proc. IEEE Int. Conf. on Intelligent Robots and Sys-tems, Villamoura, Portugal, 2012, pp. 181–186.

[26] P. Wawrzyński, ”Real-Time ReinforcementLearning by Sequential Actor-Critics and Expe-rience Replay”, Neural Networks, vol. 22, no. 10,Elsevier, 2009, pp. 1484-1497.

[27] P. Wawrzyński, ”Autonomous ReinforcementLearning with Experience Replay for HumanoidGait Optimization”, Proceedings of the Interna-tional Neural Network SocietyWinter Conference,Procedia, 2012, pp. 205–211.

[28] T. Winiarski, A. Woźniak, ”Indirect force con-trol development procedure”,Robotica, vol. 31, 4,pp. 465–478, 2013.

[29] D. Wooden, M. Malchano, K. Blankespoor, A.Howardy, A.A. Rizzi, ”Raibert, Autonomous nav-igation for BigDog”. In: Proc. IEEE Int. Conf. onRobotics and Automation, Anchorage, USA,2010, pp. 4736–4741

[30] T. Zielińska, ”Biological inspiration used forrobots motion synthesis”, Journal of Physiol-ogy – Paris, vol. 103(3-5), September 2009,pp. 133–140, 2009.

[31] T. Zielińska, A. Chmielniak, ”Biologically inspiredmotion synthesis method of two-legged robotwith compliant feet”, Robotica, vol. 29, no. 7,2011, pp. 1049–1057.

[32] M. Zucker, J.A. Bagnell, C.G. Atkeson, J. Kuffner,”An optimization approach to rough ter-rain locomotion”. In: Proc. IEEE Int. Conf. onRobotics and Automation, Anchorage, USA, 2010,pp. 3589–3595

42


Articles42


�� R�� S��-�� S�� S��

�� R�� S��-�� S�� S��

�� R�� S��-�� S�� S��

�� R�� S��-�� S�� S��


Michał Nowicki, Piotr Skrzypczyński

DOI: 10.14313/JAMRIS_4-2013/43

Abstract:�n this paper we inves�gate methods �or sel�-locali�a�ono� a walking robot with the �inect �� ac�ve range sen-sor. The �tera�ve �losest �oint (��) algorithm is consid-ered as the basis �or the computa�on o� the robot rota-�on and transla�on between two viewpoints. As an alter-na�ve, a �eature-basedmethod �ormatching o� �� rangedata is considered, using the Normal Aligned Radial Fea-ture (NARF) descriptors. Then, it is shown that NARFs canbe used to compute a good ini�al es�mate �or the �� al-gorithm, resul�ng in convergent es�ma�on o� the sensoregomo�on. ��perimental results are provided.

Keywords: �� percep�on, salient �eatures, itera�ve clos-est point, visual odometry, walking robot

�� trod�c�o�In recent years the research on Simultaneous Lo-

calization and Mapping (SLAM) in robotics heads to-wards more challenging scenarios, mostly in three-dimensional (3D) environments. Mobile robots mov-ing in 3D require three-dimensional positioning, withregard to six degrees of freedom (DOF). The 6-DOF pose contains the position in three dimen-sions, orientation, and pitch and roll angles: xR =[xr yr zr θr ϕr ψr]

T . Self-localization with regard to6-DOF is particularly important in the case of walk-ing robots, because the discrete nature of theirmotioncauses sudden and frequent changes in the robot’strunk roll, pitch and yaw angles with regard to theglobal reference frame [2].

Various sensing modalities can yield the data nec-essary for 6-DOF SLAM or self-localization. RecentSLAM algorithms focus on the use of passive visionsensors (cameras) [20]. However, passive vision re-quires speci�ic photometric features to be present inthe environment. Encountering areaswith ill-texturedsurfaces and few visually salient features the systemsrelying on passive vision usually cannot determine thepose of the robot, and easily get lost. Therefore, activerange sensors seem to be more practical solution formany mobile robots, particularly those, which have towork in highly unpredictable environments, such likedisaster sites explored by robots during search andrescue missions.

Often point clouds yielded by 3D laser scannersare matched against each other in order to obtain thedisplacement between two poses of the sensor. Thisapproach, known as scan matching, is characterizedby high reliability and relative independence from the

characteristics of the environment, because in scanmatching dense point clouds arematched against eachother, not against a particular type of features. Thescan matching procedure is typically implementedwith the Iterative Closest Points (ICP) algorithm. How-ever, implementation of 6-DOF scan matching withdata from laser scanners inmost of thewalking robotsis hardly possible because of the tight size, mass, en-ergy and computing resources limits in such robots –there are no 3D laser scanners that could be placedon board of a typical walking robot. A solution to thisproblem would be to use a compact 3D range sen-sor, which has no moving parts. In this context aninteresting device is the integrated 3D sensor devel-oped and patented by PrimeSense [21], then intro-duced to themarket by Asus as Xtion and byMicrosoftas Kinect. These sensors are compact and affordable,and as shown by the recent research [31], the rangemeasurement accuracy up to the distance of about 3.5m is comparable to that achieved by a 3D laser scan-ner.

This paper addresses the issues of reliable in-cremental self-localization for the six-legged walkingrobot Messor equipped with the Kinect sensor. Onlythe rangedata fromKinect areused, becausewewouldlike to have a procedure that is portable to other 3Drange sensors, such as the VidereDesign STOC (STereoOn a Chip) camera that yields a dense depth map inreal-time using hardware-embedded image process-ing. This camera was already successful used on theMessor robot for terrain perception [15]. Anothermo-tivation for using only the rangedata fromKinect is thefact, that the Messor robot is designed for patrol andsearch missions in buildings [32], and it should notmuch rely on the natural lighting of the scene, whichin such an application scenario may be insuf�icient.

We evaluated performance of the standard ICP al-gorithm on Kinect range data in tests that simulateda scenario typical to a walking robot traversing roughterrain. Because these tests revealed that matchingconsecutive point clouds obtained from Kinect by us-ing the ICP algorithm requires a good initial estimateof the robot displacement we propose a two-stagepoint cloudmatching procedure exploiting salient fea-tures in the range data. The feature-based matchingprocedure employs the NARF detectors/descriptors[29] and yields reliable initial alignment, which is thenre�ined using the ICP.

This work is an extension of our previous confer-ence papers [16, 17]. The major difference is evalu-ation of the proposed approach on more realistic in-

43


Articles 43


door datasets acquired with the Messor robot, and ona publicly available dataset. Moreover, presentation ofthe state of the art has been signi�icantly extended. Inthe remainder of this paperwe brie�ly review the stateof the art (section 2), then we present the concept ofthe incremental self-localization system for the Mes-sor robot with Kinect (section 3). Next, we examinetheoretical foundations of the point cloudmatching al-gorithms (section 4), and in section 5 we show howthe sought rotation and translationbetween twoview-points can be computed robustly. We demonstrate ex-perimentally the performance of the proposed self-localization system, at �irst using an industrial manip-ulator to move the sensor, and then on the actual Mes-sor robot (section 6).We conclude the paper in section7with some remarks as to the directions of further re-search.

2. State of the artNowadays passive vision is considered the most

affordable exteroceptive sensing modality for mobilerobots, and vision systems receive much attention inthe SLAM research. In particular, the monocular EKF-based SLAM algorithm [6] with its numerous exten-sions is considered one of the most successful so-lutions to the 6-DOF self-localization problem. How-ever,monocular vision provides only bearing informa-tion without a range to the detected features, whichleads to data association problems. Although it wasdemonstrated that a real-time, monocular SLAM canbe implemented on a walking robot [28], in a searchand rescue scenario, which is considered for our Mes-sor robot, the machine often explores new areas, andavoids returning to previously visited places. There-fore a self-localization system that is rather a form ofvisual odometry with some mapping capabilities, likein [30], seems tobemoreappropriate for this task thana SLAM algorithmmaintaining a global map of the en-vironment.

Active sensors, such like laser scanners and 3Drange cameras do not depend so much on both thenatural illumination of the scene, and the ubiquity ofsalient photometric features. As it was shown in ourprevious work [27] the pose of a walking robot can beprecisely estimated by using 2D laser scan matchingand data from an inertial measurements unit. How-ever, this approach works only in man-made environ-ments, where �lat walls are available. In an unstruc-tured environment, lacking such features like long �latwalls, point clouds yielded by a 3D laser scanner canbe used for self-localizationwith regard to 6-DOF [18],but a 3D laser scanner cannot be used on Messor dueto the mass and size limits.

Recently, a number of research teams demon-strated the use of Kinect/Xtion sensors for self-localization or SLAM. The 3D range data stream ob-tained from a Kinect-like sensor is very intensive.Therefore, the ICP algorithm, which tries to establishcorrespondences between all the available points re-quires much time for each iteration. Such amount ofdata can be processed using a parallel implementa-

tion of ICP on GPU [19]. Also the approach presentedin [11] heavily uses the GPU to run the ICP algorithmon the Kinect range data. It is demonstrated that thismethod is enough for real-time tracking of the sen-sor with unconstrained motion, but this solution isnot available for on-board processing inMessor,whichuses a single-core x86 processor.

The amount of data being processed can be re-duced substantially if salient features are detectedin the range or photometric data obtained from theKinect sensor. Endres et al. [8] demonstrated recentlya SLAM system using Kinect RGB and range informa-tion. This solution relies mostly on the photometricdata for salient feature detection, and then uses thedepth image to locate the keypoints in 3D space.Whilethis system shows good performance on large datasets it is unclear how the solution based on photomet-ric features (SIFT, SURF, ORB) will work on the walk-ing robot, which has the sensor pointed partially to-wards the ground, where the number of photometricfeatures can be much smaller. This system needs alsosmall displacements between the consecutive frames.Therefore, the RGB-D images from Kinect have to beprocessed at the frame rate, which is impossible withthe on-board computer of our robot. Also in [10] fea-tures extracted from PrimeSense camera’s RGB dataare used. Feature matching initializes combined fea-ture and ICP optimization. The RGB-D ICP algorithmproposed in [10] uses photometric features and theirassociated range values to obtain an initial alignment,and then jointly optimizes over the sparse featurematches and dense 3D point correspondences.

Range-only features are used in [23] by an ini-tial alignment algorithm that transforms the 3D pointclouds to the convergence basin of an iterative reg-istration algorithm, such as ICP. In this case persis-tent feature histograms are used. They are scale andpose invariant multi-value features, which generalizethe mean surface curvature at a point. The approachis shown to work indoors and outdoors, but only withrather good quality data from mechanically scanning3D laser sensors.

3. Concept of the Kinect-based self-locali�a�on s�ste�

3.1. Kinect as a sensor for the walking robot

The 3D ranging technology used in Kinect ispatentedbyPrimeSense, and thus fewdetails concern-ing the characteristics of the sensor are publicly avail-able. Therefore, we present a short overview of theprinciple of operation and basic uncertainty charac-teristics of Kinect, which are relevant to our applica-tion.

Kinect is a device consisting of several compo-nents, and is intended to be used for natural interac-tion with computer games. The basic functionality ofthe Kinect sensor allows to obtain color images (RGB)and range images of 640×480 resolution,which gives307200 points in 3D space. By combining informa-tion from the range image and the RGB image theRGB-D image can be obtained, in which RGB values

44


Articles44


are allocated to the points in the 3D space. The rangemeasurements are obtained by the principle of trian-gulation, using structured light. The laser projectoroperates in the infrared range, with the wavelengthof about 830 nm. The laser beam passes through adiffraction grating to form a constant pattern of spotsprojected onto the observed surfaces. The image ofthis pattern is captured by the CMOS infrared cam-era, and comparedwith a stored reference pattern, ob-tained by projecting the spots onto a plane located ata known distance from the sensor (Fig. 1).

disparity

CM

OS

base

x

z

IR�LASER

IR�CAMERA

SP

OT

SA

S�S

EE

N�B

YA

N

EX

TE

RN

AL

IR�C

AM

ER

A

RE

FE

RE

NC

E�S

UR

FA

CE

Fig. �. �rinciple of opera�on of the structured light sen�sor. Inset image shows the actual light spots projectedby Kinect on a flat surface

Spatial density of the measured points is de�inedby the number of projected spots per unit area. It de-creases with the square of the measured distance. Forthe measured range of 2 m the distance between thedots is about 3mm.Because the number of spots in theprojected pattern is smaller than the number of pixelsin the captured infrared image, somepixels have inter-polated values. The depth resolution decreases withthe increasing distance to the sensor. For the mea-sured range of 2 m the depth resolution is about 1 cm,and falls exponentially for longer distances. Kinect isalso very prone to errors due to environmental condi-tions, especially the ambient light. Too strong ambientlight, and the sunlight in particular, reduces the con-trast of the infrared spots in the captured image, oftenrendering the range measurements impossible.

d(4

6�c

m)

zk

�=27o

YR

ZR

Fig. �. �eometric con�gura�on of the Kinect�based per�cep�on system for the Messor robot

On the walking robot Messor the Kinect sensor ismounted on a mast (Fig. 2). The sensor is located atthe elevation of 46 cm above the ground, assuming

the neutral (default) posture of the robot. Since themain task of Kinect in this con�iguration is to observethe area in front of the robot, it was tilted down by27◦. This con�iguration of the perception system al-lows both to measure the shape of the terrain in frontof the robot, and to perceive larger objects being lo-cated farther from the sensor (Fig. 3). These objectsare important for the self-localization task, becausethey often provide more salient features than the ter-rain itself.

Fig. 3. Messor walking robot with the Kinect sensor

The most important limitation imposed on theself-localization method by the resources available onour walking robot is the processing speed. The single-board PC used by the robot cannot process the rangedata at full frame rate, no matter which algorithm isapplied for matching of point clouds. Thus, 3D pointscanbe acquiredonly at selected locations. On theotherhand, the ICP with Kinect 3D data is effective only forsmall values of translation (a few centimeters) androtation (up to about 5◦) between the consecutivepoint clouds, as it was shown in our preliminary ex-periments (see Section 6.1). For bigger translationsand�or rotations the ICP algorithmusually only �inds alocal minima. In ICP scanmatching on wheeled robotsthis problem is usually alleviated by obtaining a fairlygood initial estimate of the displacement from odom-etry, but in walking robots proprioceptive sensing isunreliable [9].

Therefore, we are looking for a self-localizationsystem, which uses range data and can process iso-lated point clouds, tolerating larger displacements(both translations and rotations) between the pointsof data acquisition. Moreover, we would like to avoiduse of any additional hardware (such like a GPUboard), which is not available to the Messor robot.Finally, we decided to use feature-based point cloudmatching to �ind a better initial guess for the ICP algo-rithm. The implementation uses the open source PointCloud Library (PCL) [24] to manipulate the 3D points,and to detect the salient features. Also the implemen-tation of ICP is taken from the PCL.

The whole procedure of self-localization is imple-mented in several steps. At �irst, the range data aretrimmed to discard themeasurements shorter than50cm and longer than 350 cm, that is unreliable or of low

45


Articles 45


depth resolution. Then, the VoxelGrid �ilter from thePCL library is used on both data sets with the voxelsize of 2 cm. This �ilter reduces the number of pointsand decreases the in�luence of small range measure-ment errors. We do not use other �ilters available inthe PCL, as their use did not improve the results, whileit consumed additional time for processing. NARF key-points are detected in both point clouds under consid-eration, and then their descriptors are computed. Thenext step is matching of the descriptors as describedin Section 4.2. The kinematic constraints are imposedin this step. If a suf�icient matching is found, the trans-formation between the two point clouds is estimated,and the data set A is transformed to the new coordi-nates. Then, the standard ICP procedure from the PCLlibrary is applied to the twodata sets in order to obtaina more precise transformation.

4. Methods for range data matching4.1. The ICP algorithm

There are many variants of the ICP algorithm de-scribed in the literature. The basic version was pro-posed in [4], while the improvements introduced laterinvolve speeding up the algorithm [22], improving itsrobustness to noise in range data, and improving ro-bustness to wrong initial guess of the transforma-tion [25]. In this paper, the standard ICP algorithmis used, in the implementation available in the PCLlibrary. This choice makes it possible to asses howmuch the proposed feature-based matching methodimproves the self-localization with regard to the base-line method.

Generally, the ICP algorithm is aimed at match-ing of two sets of spatial data (usually represented bypoints) describing the same object or part of a scene,and then to represent these sets in a common coor-dinate system. In each iteration, the ICP algorithm se-lects the closest points (Euclidean distance) in bothclouds as the matching points. For matching points,the rotation R and translation t are computed, whichminimize the criteria:

( R, t) = argminR, t

nA∑i=1

nB∑j=1

wi,j∥ pAi− ( R pBj

+ t)∥2 ,

(1)where nA and nB are the numbers of points in thecloudsA and B, respectively, wi,j are weights de�inedas wi,j = 1 when points pAi

and pBjare closest

neighbors orwi,j = 0otherwise. Assuming that pointsare matched correctly, the transformation ( R, t) iscomputed by using the SVD method.

When the transformation that minimizes (1) iscomputed, the A set of points is transformed to thenew position and new correspondences of points areestablished. Due to the search for the closest neigh-bors the computational complexity of ICP is quadraticwith regard to the number of points in the two clouds.A more effective way of searching neighbors can beachieved using k-d trees to represent the data.

It is necessary to introduce a limit for the max-imum Euclidean distance between neighbors, be-

cause the clouds may not represent the same part ofthe scene, and unlimited search for correspondencescould yield a completely wrong spatial transforma-tion. This limit, expressed by the dmax distance, pre-vents the algorithm frommatching points that are toofar away from each other. The ICP algorithm is onlyguaranteed to converge to a localminima, andmay notreach the global one [4]. Because of this, it is extremelyimportant to have a good initial guess for the sought( R, t) transformation.

4.2. Feature-based matching of range data

An alternative approach to matching 3D data ex-ploits salient features of various types. This approachis similar to matching 2D photometric images [14]. Itis based on �inding some characteristic keypoints inboth data sets, then describing them with unique de-scriptors, and �inding the matching between the com-puted descriptors. These features are salient points,which do not represent pre-speci�ied structures, andcommonly appear in both man-made and natural en-vironments.

Because we are interested in the range data fromKinect, which we consider to be a richer representa-tion of the environment than the photometric (RGB)data, we choose the Normal Aligned Radial Feature(NARF) detector/descriptor for our implementationof the feature-based point cloud matching procedure.The NARF concept was introduced for object recog-nition from 3D range data [29]. The NARF detectorsearches for stable areas with signi�icant changes inthe neighborhood, that can be keypoints looking sim-ilar from various viewpoints. Detection is done by de-termining the plane minimizing the mean square er-ror between the processed point, and other pointscontained in the prede�ined sphere around this point.When themean square errordoesnot exceed the giventhreshold, the tested point is considered to be a NARFkeypoint. The most important parameter of the NARFdetector is the diameter of the sphere around the pro-cessed point. This sphere contains points, which dom-inant directions are used to determine the �coef�icientof interest”, and then to detect the keypoints.

Then, descriptors are built upon the areas aroundthe keypoints. The NARF descriptor characterizes thekeypoint by determining depth changes in each di-rection, calculating an aligned range value patch, andidentifying the dominant orientation of this patch.Thus, descriptors are the normalized values of gradi-ent in n directions with regard to the keypoint. Thenumber of directions is also the length of the NARFdescriptor, here n=36 is used. The number of NARFdescriptors in a dataset depends on the environmentcharacteristics, but in some cases hundreds of descrip-tors in one point cloud can be established.

�. Transforma�on between two �oint cloudswith salient featuresOnce the keypoints and NARF descriptors are de-

termined in both point clouds under consideration it

46


Articles46


is necessary to compute the rotationR and translationt between these two data sets.

In this step of the self-localization method the in-put data consists of a point setA (the �irst cloud) anda point setB (the second cloud). The set of all possiblepairs of descriptors is de�ined as Z . For each descrip-tor of the �irst point cloud all possible pairingswith thedescriptors of the second cloud are established. Forthe set Z the matchings are de�ined by the descrip-tor dissimilarity function, which in our application isgiven as:

fk = minj=0,1,...,mB−1

36∑i=1

|pk[i]− qj [i]|, (2)

where fk is the value of dissimilarity, which is the sumof absolute differences between 36 values describingthe k-th descriptor of the cloudA, and the correspond-ing values of thedescriptor of the cloudB. Thedescrip-tor of the cloud B is chosen to minimize the value offk . In equation (2) pk[i] is the i-th value of the k-th de-scriptor of the �irst point cloud, while qj [i] is the i-thvalue of the j-th descriptor of the second point cloud,and mB is the number of descriptors of the secondcloud. Then again, all pairings of the given descrip-tor of A are investigated, and the pairs for which thedissimilarity value is greater than 3 times fk are dis-carded as incorrect.

Unfortunately, among the accepted pairing therecould be incorrect correspondences, minimizing (2),but leading to a clearly incorrect estimate of the ego-motion. To solve this problem we exploit the fact, thatthe walking robot is a physical system, having limitedmotion capabilities with regard to acceleration, veloc-ity, and direction of motion. Taking this into account itis possible to introduce a set of constraints on themax-imum translations and rotations with regard to theparticular axes of the robot’s coordinate frame. Theseconstraints are used to discard pairs of the descriptorsfor which the distance between their keypoints is big-ger than the maximum distance:

ddes =√d2desX + d2desY + d2desZ , (3)

where ddesX is themaximumdistance in xr axis, ddesY isthe maximum distance in yr axis, and ddesZ is the max-imum distance in yr axis. Formula (3) is simpli�ied inorder to allow fast computations, and does not take ex-plicitly into account rotation of the robot, but for thelimited range of Kinect measurements it is a reason-able approximation, as we do not expect to have key-points on objects located far away. This condition al-lows to discard most of the incorrectly matched pairsfrom the Z set.

However, the set of matched NARF pairs still cancontain some outliers. Therefore, the point cloudtransformation algorithm is applied within the robustestimation framework based on the RANSAC scheme.In each of its iterations RANSAC de�ines a subsetN ofthe set Z , which contains 3 unique descriptor pairs.The next step of the algorithm computes the transfor-

mation between those pairs:

argminT

∥ T P− Q∥, (4)

where Tmeans the transformation combined of rota-tion R and translation t, while P and Q are matricesbuilt from the coordinates of keypoints as follows:

P =

x11 y11 z11x12 y12 z12x13 y13 z13

, (5)

Q =

x21 y21 z21x22 y22 z22x23 y23 z23

, (6)

where x1i, y1i, z1i are the coordinates of the i-th key-point from the cloud A, belonging to the set N , whilex2j , y2j , z2j are the coordinates of the j-th keypointfrom the cloud B, also belonging to the setN .

The 3D rigid body transformation that aligns twosets of points for which correspondence is known canbe computed in several ways. In [7] four algorithmsare compared, each of which computes the rotation Rand translation t in closed form, as the solution to aformulation of the problem stated by (4). The resultsof comparison presented in [7] show that for the al-gorithms under study there is no difference in the ro-bustness of the �inal solutions, while computing ef�i-ciencymainly depends on the implementation. There-fore, to solve (4) we chosen the algorithm based onSingular Value Decomposition (SVD) computation us-ing the standard ( R, t) representation of the transfor-mation. Although this method has been for the �irsttime proposed byKabsch [12] often amuch laterworkon computer vision [1] is cited as the origin.

After computing the transformation Tnewcoordi-nates for the points of the cloudA are calculated. Thenext step involves computing the matching error:

ϵ =

nA∑i=1

( hi− dj)2 for j = argmin

j=0,1,...,nB−1( hi− dj)

2,

(7)where hi is the vector of coordinates of i-th keypointof the point cloud A transformed by T, dj is the vec-tor of coordinates of the j-th keypoint of the cloud B,which is the closest to hi. When the value of ( hi −dj)

2 is greater than a �ixed parameter, the keypoint istreated as an outlier and the computed distance is notcounted in (7).

The necessary condition of stopping RANSAC isachieving the error ϵ below a preset value, or not im-proving the best model for a �ixed number of itera-tions, or exceeding the number ofmaximum iterations[17]. Alternatively, in the new version of our softwarethe necessary number of RANSAC iterations can be es-timated using a simple probabilistic model [5], whichslightly improves on the computing time.

WhenRANSAC is �inished, the transformationwiththe minimal error is performed on all points of thecloud A. This method transforms the point clouds tothe convergence basin of the ICP algorithm, thus im-proving the chance of �inding the globalminimum. The

47


Articles 47


points belonging to cloud A are transformed to thenew coordinates by the formula:

p′A[i] = R pA[i] + t for i = 1, 2, ..., nA (8)

where pA[i] means the i-th point of the �irst pointcloud. The newly computed set A′ is then the pointcloud for the ICP algorithm searching for a more pre-cise transformation ofA′ into B.

6. ��erimental e�al�a��n �� t�e met��6.1. Preliminary tests

Testing Kinect as a sensor for walking robot self-localization we conducted several experiments toevaluate robustness of the basic ICP algorithmand ourNARF-based method for initial alignment of the pointclouds. These initial tests were performed on pairsof point clouds obtained for various data samplingrates along the path of the sensor motion. First testswere aimed at determining the approximate maxi-mum translational and rotational distance for whichtwo Kinect point clouds can be successfully aligned bythe ICP algorithm. Then, we tested if these limits canbe relaxed by applying the NARF-based method.

a

b

c

d

Fig. 4. Experiment with the industrial robot: terrainmockup as seen by Kinect (a), point clouds with NARFsprior to matching (b), matching result with ICP (c),matching result with NARF+ICP (d)

Experiments were carried out using the Kuka KR-200 industrial robot, which allowed us to obtain

ground truth for the sensor’s trajectory. The Kinectsensor was moved over a mockup of rocky terrain.The sensorwas attached to the robot’s wrist and tilteddown at an angle of 27◦, as in the con�iguration usedon thewalking robot. An RGB-D rendered image of theexperimental setup is shown in Fig. 4a. The process-ing times mentioned further in the paper were mea-sured under Linux on a low-end PC notebook with anIntel Core2duo T6500 2.1GHz processor and 4GB ofRAM. Although the notebook had a dual-core proces-sor the single-thread program didn’t use this feature,effectively running in a single core con�iguration, sim-ilar to the one used on the actual Messor robot.

In the experiment shown in Fig. 4 the robot’send effector moved 20 cm along a straight line. Twodatasets were obtained at the beginning, and at theend of the trajectory (Fig. 4b). The �irst dataset (A) isdepicted by darker points, while the second one (B) isshown in lighter shade of grey. The small squaresmarkNARF keypoints for each of the point clouds, while de-scriptors that were used to determine the transfor-mation between the clouds are marked with largersquares.

a

b

Fig. �. Rota�onal mo�on o� the sensor: point cloudswith NARFs prior to matching (a), matching result withNARF+ICP (b)

Results of this simple test have shown that for thedisplacement of 20 cm the standard ICP algorithmwasunable to �ind any satisfying estimate of the egomo-tion of the sensor (Fig. 4c). It was not possible de-spite the fact that we have tested various values ofthe dmax parameter and allowed for thousands of iter-ations of the algorithm. In our experiments for largeinitial displacements between the point clouds the ICPalgorithm usually converged to a local minima, yield-ing a wrong egomotion estimate. However, the NARF-based matching algorithm was able to �ind a good ini-tial transformationof theAdataset, located in the con-vergencebasin of the ICPalgorithm.Thus, the ICPusedas the second step of the self-localization method wasable to �ind an acceptable estimate of the displace-ment: 17.3 cm, and a good alignment of the two clouds

48


Articles48


(Fig. 4d). The time required by particular parts of theregistration procedurewas: detection and descriptionof the NARFs 0.84 s, matching with the RANSAC pro-cedure 0.25 s, and then the ICP algorithm 0.27 s.

A similar experiment was performed with the ro-tational motion of the sensor. The robot wrist was ro-tated by 10◦ between two poses of point cloud ac-quisition (Fig. 5a). Again, the standard ICP could not�ind a satisfying transformation between the clouds.The initial displacement found by the NARF-basedmethod was quite imprecise – a rotation by 4.7◦ andsmall translations along all axes. However, this initialtransformation turned out to provide a good enoughstarting point for the ICP algorithm, transforming theclouds to its convergence basin. As the result of thetwo-step registration procedure a good alignmentwasachieved (Fig. 5b), with the rotation value close to theactual 10◦.

a b

c d

Fig. 6. Results for the publicly available dataset: a viewof the environment from the Kinect camera (a), pointclouds with NARFs prior to matching (b), NARF-basedini�al alignment (c), matching result with NARF�� (d)

Unfortunately, we do not have a motion capturesystem that can provide reliable ground truth for en-vironments more extended than the relatively smallmockup used with the industrial robot. Therefore, weevaluated the performance of our self-localization sys-tem also on a publicly available dataset1. This dataset,which contains RGB-D data from Kinect, and time-synchronized ground truth poses of the sensor ob-tained from a motion-capture system was recentlyused to evaluate a SLAM algorithm [8]. Figure 6ashows an example RBG image taken from this dataset,while Fig. 6b presents two point clouds taken fromthis sequence (the �irst one is synchronized with theshown RGB image) with detected NARF keypoints. Asit can be seen in Fig. 6c the NARF-basedmatching pro-cedure provides quite good initial alignment of thetwo clouds, which is however further re�ined by theICP method (Fig. 6c). Quantitative results (in meters)for this experiment are presented in Tab. 1. For thisdataset the detection and description of NARFs took0.735 s, and the estimation of initial alignment usingRANSAC required 0.291 s.

Displacement [m] ∆xr ∆yr ∆zrGround truth 0.040 -0.028 -0.011NARF-based alignment 0.069 -0.051 -0.076NARF+ICP estimate 0.046 -0.048 -0.040

�ab. �. �s�mated egomo�on for the publicly availabledata

6.2. Indoor tests on the Messor robot

Next, the proposed method was veri�ied on theMessor robot. Because we do not have any externalmotion capture system to measure the pose of therobot during the experiments, we compare the re-sults of the Kinect-based self-localization to the re-sults obtained from another self-localization system –Parallel Tracking and Mapping (PTAM) software [13],which uses monocular vision data from another cam-era mounted on Messor. The PTAM system offers pre-cise tracking of the camera pose, andwas already eval-uated on the Messor robot with positive results [3].However, PTAM works well only in environments richin photometric features, and is limited to rather smallworkspaces. Therefore, PTAM was unable to provideself-localization results for longer trajectories, and wehave decided to show here only results of pair-wisematching of two point clouds, similarly to the the in-dustrial robot experiments.

a b

c d

Fig. 7. Results for the Messor robot traversing a sim-ple mockup: a view from the Kinect camera (a), pointclouds with NARFs prior to matching (b), NARF-basedini�al alignment (c), matching result with NARF�� (d)

Although Messor does not have reliable odometry,the knowledge of the motion direction and the esti-mated length of the stepwas used to impose kinematic

49


Articles 49


constraints on the egomotion estimates computed bythe self-localization algorithm. The constraints are notso tight: 50 cm translation along each axis, and 30◦ ro-tation around each axis are allowed. These values arenot violated even by sudden and unpredictable mo-tions related to slippages of the walking robot [2].They simply re�lect the fact, that the robot cannot be“teleported” during the relatively short time betweenthe two moments of range data acquisition.

Displacement [m] ∆xr ∆yr ∆zrPTAM results 0.043 0.188 0.003NARF-based alignment 0.047 0.186 0.069NARF+ICP estimate 0.039 0.187 0.007

�ab. �. �s�mated egomo�on for the simple moc�up e��periment

The �irst dataset was obtained for translationalmotion of the Messor robot on a simple terrainmockup, which is visible in Fig. 7a. This environmentcontained few larger geometric features, so the NARFdescriptors were mostly located on the borders ofthe mockup and on more distant objects, outside themockup (Fig. 7b). Table 2 provides the egomotion es-timation results for two versions of the range-data-based point cloud registration procedure: the initialtransformationwith NARF keypoints (Fig. 7c), and thetwo-step procedure applying the ICP algorithm to theinitially aligned point clouds (Fig. 7d). These resultsare compared to results provided by the PTAM sys-tem, which could be considered very accurate in thisexperiment, because the surface of the mockup is richin photometric features. The NARF detection, descrip-tion, and computation of the initial alignment required1.055 s, while the whole self-localization procedurebetween two point clouds, with theNARF-based align-ment and ICP took 1.543 s on a single-core PC.

A similar experiment was conducted in one of cor-ridors in our building. This corridor is ratherwide andcontains some furniture, as seen in Fig. 8a. The camerayielding images for the PTAMalgorithm ismounted onMessor similarly to theKinect sensor – it is tilted downin order to see the surface in front of the robot. There-fore, PTAM in our application relies mostly on photo-metric features found on the ground. In the corridorthe �loor is very dark, and fewnatural photometric fea-tures can be found here, which makes PTAM quite un-reliable in this environment. Therefore we put somesmall stones on the �loor to increase the number offeatures for PTAM. Table 3 provides results for the ini-tial NARF-based transformation (Fig. 8c), and the fullNARF+ICP version of the proposed method (Fig. 8d).In this case the NARF detection, description, and com-putation of the initial transformation took 1.756 s, be-causemore RAN�AC iterations were needed to �ind anacceptable initial estimate of the egomotion. This wasin turn caused by small number of NARFs that werefound on the �loor. The whole self-localization proce-dure including ICP required 2.315 s. However, theseresults are still acceptable for our application, as the

a b

c d

Fig. 8. Results for theMessor robotmoving in a corridor:a view from the Kinect camera (a), point clouds withNARFs prior to matching (b), NARF�based ini�al align�ment (c), matching result with NARF+ICP (d)

walking robot needed four seconds to cover the dis-tance of about 20 cm, using a tripod gait and makingsteps of the length of 5 cm. Thus, our system was ableto complete all self-localization steps between the twoconsecutive data acquisition events.

Displacement [m] ∆xr ∆yr ∆zrPTAM results -0.027 0.217 0.016NARF-based alignment -0.468 0.246 0.021NARF+ICP estimate -0.042 0.244 0.018

�ab. �. �s�mated egomo�on for the corridor e�peri�ment

6.3. Outdoor tests on the Messor robot

In order to further verify the Kinect applicability towalking robot self-localization, experiments were car-ried out also in outdoor environment (Fig. 9a). Whenthe sensor was exposed to direct sunlight, it was notpossible to get any usable range measurements. How-ever, it was possible to use Kinect for self-localizationin the shadeof a buildingwall. �ecause of the in�luenceof sunlight (even in a shaded place) the obtained pointclouds had only about 30% of the number of pointstypically obtained indoors (Fig. 9b). During the exper-iment, the robot moved forward, roughly along its yaxis, on slightly uneven ground. The reduced numberof points turned out to be suf�icient for the proposedalgorithm. This is due to the fact that the visible (mea-sured) areas are almost the same for both point clouds(Fig. 9c). Thus, the algorithmhas found the correct dis-placement in spite of the environment conditions be-ing dif�icult for the Kinect sensor (Fig. 9d). The time

50


Articles50


required to �ind the NARFs and build the descriptorswas only 0.15 s, obviously due to to the small num-ber of points. The RANSAC procedure took 0.28 s –this value is quite stable across all our experiments. Fi-nally, the ICP procedure required only few iterationsto converge from a very good initial guess, thus thetime was only 0.06 s.

a

b

c

d

Fig. 9. Outdoor experimental setup (a), outdoor sceneas seen by Kinect (b), two point clouds prior to match-ing with NARF correspondences (c), results of matchingwith NARF+ICP (d)

7. ConclusionsThis paper presented a two-step range data regis-

tration method, which is suitable for the inexpensiveKinect 3D range sensor mounted on a walking robot.

The experiments have shown that the use of NARFdescriptors matching brings the point clouds close tothe correct position even for relatively large transla-

tions and rotations between the viewpoints. This pro-cedure provides a good initial estimate of the ego-motion, which is within the convergence basin of theICP algorithm. In turn, the ICP algorithm allows pre-cise alignment of the two point clouds, using all therange information. The combinationof these twostepsyields a robust estimate of the robot’s egomotion invarious environments, evenundermoderately varyinglighting conditions.

Further studies will be carried out to acceleratethe ICP implementation, and to fully exploit the com-bined RGB-D data. Using the range and photomet-ric information the robot should be capable of self-localizing in both featureless areas (e.g. long corri-dors), and poorly illuminated areas. Implementationof a full scan-matching-based visual odometry systemfor the walking robot is also a matter of further re-search.

Notes1Data downloaded from:

http://vision.in.tum.de/data/datasets/rgbd-dataset

AUTHORSMichał Nowicki – Poznań University of Technol-ogy, Institute of Control and Information Engineer-ing, ul. Piotrowo 3A, 60-965 Poznań, Poland, e-mail:[email protected] Skrzypczyński∗ – Poznań University of Tech-nology, Institute of Control and Information Engineer-ing, ul. Piotrowo 3A, 60-965 Poznań, Poland, e-mail:[email protected].∗Corresponding author

REFERENCES[1] K. S. Arun, T. S. Huang, S. Blostein, ”Least-squares

�itting of two 3-D point sets”, IEEE Trans. on Pat-tern Analysis and Machine Intelligence, vol. 9, no.5, 1987, pp. 698–700.

[2] D. Belter, P. Skrzypczyński, ”Rough terrain map-ping and classi�ication for foothold selection ina walking robot”, Journal of Field Robotics, 28(4),2011, pp. 497–528.

[3] D. Belter, P. Skrzypczyński, Precise self-localization of a walking robot on rough terrainusing PTAM, in: Adaptive Mobile Robotics (N.Cowan et al., eds.), Singapore, World Scienti�ic2012, pp. 89–96.

[4] P.J. Besl, N.D. McKay, ”A method for registra-tion of 3-D shapes”, IEEE Trans. on Pattern Anal-ysis and Machine Intelligence, vol. 14, 2, 1992,pp. 239–256.

[5] S. Choi, T. Kim,W. Yu, ”Performance evaluation ofRANSAC family”, In: Proc. British Machine VisionConference, London, 2009.

[6] A. Davison, I. Reid, N. Molton, O. Stasse,”MonoSLAM: Real-time single camera SLAM”,

51


Articles 51


IEEE Trans. on Pattern Analysis and MachineIntelligence, vol. 29, no. 6, 2007, pp. 1052–1067.

[7] D.W. Eggert, A. Lorusso, R.B. Fisher, ”Estimating3-D rigid body transformations: a comparison offour major algorithms”, Machine Vision and Ap-plications, vol. 9, no. 5–6, 1997, pp. 272-290.

[8] F. Endres, J. Hess, N. Engelhard, J. Sturm, D. Cre-mers, W. Burgard, ”An evaluation of the RGB-DSLAM system”, in: Proc. IEEE Int. Conf. on Robot.and Automat., St. Paul, 2012, pp. 1691–1696.

[9] B. Gaßmann, J. M. Zöllner, R. Dillmann, ”Naviga-tion of walking robots: localisation by odome-try”. In: Climbing andWalking Robots VIII, Berlin,Springer 2005, pp. 953–960.

[10] P. Henry, M. Krainin, E. Herbst, X. Ren, D. Fox,”RGB-D mapping: Using Kinect-style depth cam-eras for dense 3D modeling of indoor environ-ments”, Int. Journal of Robotics Research, vol. 31,no.5, 2012, pp. 647–663.

[11] S. Izadi et al., ”KinectFusion: real-time 3D recon-struction and interaction using a moving depthcamera”, ACM Symp. on User Interface Softwareand Technology, New York, 2011, pp. 559–568.

[12] W. Kabsch, ”A solution of the best rotation to re-late two sets of vectors”, Acta Crystallographica32:922, 1976.

[13] G. Klein, D. Murray, ”Parallel tracking andmapping for small AR workspaces”, In:Proc. Int. Symp. on Mixed and AugmentedReality, Nara, 2007, pp. 225–234.

[14] D.G. Lowe, ”Distinctive image features fromscale-invariant keypoints”, Int. Journal of Com-puter Vision, vol. 60, 2, 2004, pp. 91–110.

[15] P. Łabęcki, P. Skrzypczyński, ”Real-time visualperception for terrain mapping in a walkingrobot”, In: Adaptive Mobile Robotics (N. Cowanet al., eds.), Singapore, World Scienti�ic 2012,pp. 754–761.

[16] M. Nowicki, P. Skrzypczyński, ”Experimental ver-i�ication of a walking robot self-localization sys-tem with the Kinect sensor”, Prace NaukowePolitechniki Warszawskiej – Elektronika. Postępyrobotyki, vol. 182, Warsaw, 2012, pp. 561–572(in Polish).

[17] M. Nowicki, P. Skrzypczyński, Robust registra-tion of Kinect range data for sensor motion es-timation, in: Computer Recognition Systems 5 (M.Kurzynski, M. Wozniak, eds.), Berlin, Springer-Verlag 2013. (in print)

[18] A. Nüchter, K. Lingemann, J. Hertzberg, H. Sur-mann, 6D SLAM – 3D mapping outdoor environ-ments, Journal of Field Robotics, 24(8–9), 2007,pp. 699–722.

[19] A. Nüchter, S. Feyzabadi, D. Qiu, S. May, SLAMà la carte – GPGPU for globally consistent scan

matching, in: Proc. 5th European Conf. on MobileRobots (ECMR), Örebro, 2011, pp. 271–276.

[20] L. Paz, P. Piniés, J. D. Tardós, J. Neira, Large-scale6-DOF SLAM with stereo in hand, IEEE Trans. onRobotics, vol. 24, no. 5, 2008, pp. 946–957.

[21] PrimeSense, 2010,http://www.primesense.com

[22] S. Rusinkiewicz, M. Levoy, ”E�icient variants ofthe ICP algorithm”, in: Proc. 3rd Int. Conf. on3D Digital Imaging and Modeling, Quebec, 2001,pp. 145–152.

[23] R. B. Rusu, N. Blodow, Z. Marton, M. Beetz,”Aligning point cloud views using persistent fea-ture histograms”. In: Proc. IEEE/RSJ Int. Conf.on Intelligent Robots and Systems, Nice, 2008,pp. 3384–3391.

[24] R. B. Rusu, S. Cousins, ”3D is here: Point CloudLibrary (PCL)”. In: Proc. IEEE Int. Conf. on Robot.and Automat., Shanghai, 2011, pp. 1–4.

[25] A. Segal, D. Haehnel, S. Thrun, ”Generalized-ICP”.In: Proc. of Robotics: Science and Systems, Seattle,2009 (on-line).

[26] P. Skrzypczyński, ”Simultaneous localizationand mapping: a feature-based probabilisticapproach”, Int. Journal of Applied Mathematicsand Computer Science, 19(4), 2009, pp. 575–588.

[27] P. Skrzypczyński, ”Laser scan matching for self-localization of a walking robot in man-made en-vironments”, Industrial Robot: An InternationalJournal, 39(3), 2012, pp. 242–250.

[28] O. Stasse, A. Davison, R. Sellaouti, K. Yokoi, ”Real-time 3D SLAM for humanoid robot consideringpattern generator information”. ”‘n: Proc. IEEEInt. Conf. on Intelligent Robots and Systems, Bei-jing, 2006, pp. 348–355.

[29] B. Steder, R. B. Rusu, K. Konolige, W. Burgard,”Point feature extraction on 3D range scans tak-ing into account object boundaries”. In: Proc.IEEE Int. Conf. on Robot. and Automat., Shanghai,2011, pp. 2601–2608.

[30] A. Stelzer, H. Hirschmüller, M. Görner, ”Stereo-vision-based navigation of a six-legged walk-ing robot in unknown rough terrain”, Int. Jour-nal of Robotics Research, vol. 31, no. 4, 2012,pp. 381–402.

[31] T. Stoyanov, A. Louloudi, H. Andreasson, A. Lilien-thal, Comparative evaluation of range sensor ac-curacy in indoor environments, in: Proc. 5th Eu-ropean Conf. on Mobile Robots (ECMR), Örebro,2011, pp. 19–24.

[32] K. Walas, D. Belter, Messor – Verstatile walkingrobot for search and rescue missions, Journalof Automation, Mobile Robotics & Intelligent Sys-tems, vol. 5, no. 2, 2011, pp. 28–34.

52


Articles52


�� S�� A�� R�� S�� I��

�� S�� A�� R�� S�� I��

�� S�� A�� R�� S�� I��

�� S�� A�� R�� S�� I��

�u��ed: 1�th January 2013; accepted: 4th June 2013

Mateusz Żarkowski

DOI: 10.14313/JAMRIS_4-2013/53

Abstract:�he ar�cle presents the preliminary concept of facialemo�on recogni�on system. �he approach focuses on thefeature e�trac�on and selec�on process which simplifiesthe stage of defining and adding new elements to thefeature set. �he evalua�on of the system was performedwith two discriminant analysis classifiers, decision treeclassifier and four variants of k-nearest neighbors classi-fier. �he system recogni�es seven emo�ons. �he verifica-�on step u�li�es two databases of face images represent-ing laboratory and natural condi�ons. �ersonal and in-terpersonal emo�on recogni�on was evaluated. �he best�uality of classifica�on for personal emo�on recogni�onwas achieved by �� classifier, the recogni�on rate was��.�% for the laboratory condi�ons and ��.�% for natu-ral condi�ons. �or interpersonal emo�on recogni�on therate was 82.5%.

Keywords: machine learning, image processing, socialrobo�cs

�� trod�c�o�Emotions are inherent part of human nature and

can be observed in the gestures of people, their wayof movement or the tone of their voice, however themost important tool for emotional expression is theface. The �irst scienti�ic study of human facial expres-sions was done by Duchenne [1] and Darwin [2]. Mod-ern psychologists de�ine six basic facial expressionsof emotions – anger, disgust, fear, happiness, sadnessand surprise [3]. As a way to standardize the processof emotion recognition, speci�ic groups of facial mus-cles are de�ined as facial Action Units (AU), which to-gether form Facial Action Coding System (FACS) [4].This system de�ines rules of how to compose FacialAction Units into particular facial expressions, as de-scribed in [5].

The basic structure of automatic facial expressionanalysis (AFEA) as desribed in [6, 7] consists of fol-lowing steps: face acquisition, facial data extractionand representation, and facial expression recognition.The goal of face acquisition is detection and localisa-tion of the face region in the input image or video se-quence, most systems use Haar-based facial detectionintroduced by Viola and Jones [8]. In facial feature ex-traction for expression analysis, twomain approachescan be distinguished: geometric feature-based meth-ods [9, 10] and appearance-based methods [11]. Thegeometric facial features represent the position andshape of facial components (such as mouth, brows,

eyes, etc.). The appearance-based methods utilize im-age �ilters (ie. �abor wavelets) applied over selectedregions of the face image to extract feature vectors.The facial expression recognition system can classifythe facial changes as prototypic emotional expressions(anger, disgust, fear, happiness, sadness and surprise)or as facial action units (AU of FACS) mentioned ear-lier. Moreover, depending on the utilization of tempo-ral information frame-based and sequence-based ap-proaches are de�ined. The additional review of studiesin the �ield of facial expression recognition has beendescribed in [12] and [13].

This paper serves an extended version of the arti-cle [14], which presents the preliminary results on theconstruction of emotion recognition system based onsingle-frame facial image analysis for its applicationby a social robot during human-robot interaction. Dueto the nature of such interaction, the accuracy and thespeed of the system are crucial, in order for the robotnot to lose its social credibility. Seven emotion classesare taken into account: six basic emotions and a neu-tral facial expression. System performance in the caseof one-person and multipersonal situations is evalu-ated. To solve the face acquisition problem the posi-tionof the face is trackedbypattern �ittingwith theuseof Active Shape Models and Active Appearance Mod-els [15, 16], which results in a 3-D mesh model. TheFaceTracker [17] application is utilized for this pur-pose, because of its ability to provide real-time facetracking along with a good mesh resolution. Also, thesoftware is open-source, making it easy to incorporateinto the architecture of a social robot. The resultingfacial mesh model is then subject to feature extrac-tion. In usual approach, the number of extracted fea-tures is low, as each feature describes a speci�ic geo-metric property of the face, such as width of the lipsor the angle of the brows. For example, in [18] only 24features, 15 parameters of upper face and 9 parame-ters of lower face are de�ined. This becomes the seri-ous issue in all recognition systems. How do we knowif the chosen feature set is suf�icient� �ould the ad-dition of a missing feature result in an improvementof classi�ication accuracy� How to �ind such features�The approach for feature extraction proposed in thispaper is similar to brute-force attack in cryptogra-phy. A large number of features is calculated from the3-D mesh model, then a feature selection algorithmshould distinguish the features that best discriminatethe given facial emotional expressions. The expres-sion classi�ication is performed with basic classi�iers(discriminant analysis, k-nearest neighbours, decision

53


Articles 53


trees), which should emphasize the general nature ofthe proposed methodology. Support Vector Machinesand Neural Networks are often used in facial expres-sion recognition, however, in this case, they could ob-scure the results of feature extraction step. Moreover,due to themodular nature of the system,more sophis-ticated classi�ication methods can be easily applied ifneeded. The results presented here should serve as astarting point for further system development.

The emotion recognition system consists of staticfacial image analysis in order to classify it as oneof seven emotion classes. Before proper application,the system requires to be trained. The architectureoverview is presented in �ig. 1. The following subsec-tions describe the components of the system.

The �irst step of the image processing is face de-tection and tracking. This results in a set of parame-ters that describe the selectedproperties of an aquiredface that are used for further analysis. For this pur-pose the FaceTracker [17] application, developed byJ. Saragih, was utilized. It performs face detection andtracking by iterative �itting of deformable 3� mod-els of human face. More information about this appli-cation can be found in [19] and [20]. Yielded modelconsists of 66 points and connections between themthat envelop the important features of the face. Thepoints and connetions together form a three dimen-sional grid - amesh. This model adapts to the face thatit is tracking, taking into acount such parameters asthe positioning of eyebrows, mouth and jaw. The ex-ample of FaceTracker performance is shown in �ig. 2.This �itting results in a set of points in a three dimen-sional space that describe the deformed model in itsstandardized orientation, the scale of the model andits orientation. The above parameters along with thedesciprition of themesh connections are transfered tothe next processing stage.

Presented methodology is based on performingthe training stage before the proper use of the system.The training data have a twofold task. First, becauseof this data it is possible to reduce the dimensional-ity of the feature vector by identifying the featuresthat provide the best distinction between the recog-nized classes. At the system application stage, only thefeatures that carry the most important informationare calculated, which speeds up the feature extractionprocess signi�icantly. Second, the reduced feature setis used to train the classi�ier. The next section presentshow the features are calculated and tools that are usedfor feature selection and classi�ication.

In the proposed approach, the identi�ication sys-temuses only the geometric data obtained through theFaceTracker application, without taking into accountany information about the texture. Pre-processing andparametrization of the face produces a set P consist-ing of 66 vectors that desribe the face. These vec-

tors represent the position of characteric points of thethree-dimensional model that has been a subject tothe subsequent deformations during the process offace detection. They are formed by the projection ontothe frontal plane, which due to the standardized formof the model (both model scale and its orientation areseparate parameters) requires only the removal of thedepth component. Let pxi and pyi be the correspond-ing components of the i-th vector, and peuci denote itsnorm.

P ={pi : pi ∈ IR2; i = 1, . . . , 66

},

pi = (pxi , pyi ), p

euci =

√(pxi )

2 + (pyi )2. (1)

Also, we introduce a set of vectorsA describing theaverage face, which is a special case of the set P . Cal-culation of the characteristic points Ai of the set A isdone by averaging all the faces in the training set. Thisprocedure allows to determine the deviation ∆pi be-tween the face that is being processed and the averageface of the whole training set

∆pxi = pxi −Axi , (2)

∆pyi = pyi −Ayi , (3)

∆peuci =√(∆pxi )

2 + (∆pyi )2. (4)

LetD denote the set of distances between all of thecharacteristic points, while also adding the constraintof no redundancy of this data. Thismeans that the car-dinality (size) of the setD is

(662

)= 2145

D = {di,j : i = 1, . . . , 65; j = i+ 1, . . . , 66} . (5)

The proper combination of the characteristic pointsfrom the setP is de�ined by themesh of themodel. Letthe mesh consist of T (in this case T = 91) triangleswhose vertices are labeled by the indices of the char-acteristic points. Let us de�ineC as the set of triples ofnatural numbers, which represent the indices of ver-tices for each triangle. This allows to calculate the an-gles α inside each of these triangles

α ={α(i,j,k), α(j,k,i), α(k,i,j) : (i, j, k) ∈ C

}(6)

α(i,j,k) = arccos(

pji · pjk|pji||pjk|

), pij = pj − pi . (7)

As in the case of positions, angles are also calcu-lated for the average face A, which allows the calcu-lation of relative changes in these angles. By denotingthe angles of the average face as αA

i,j,k , these changesare describe with the following equation

∆α ={∆α(i,j,k) = α(i,j,k) − αA

(i,j,k)

}. (8)

54


Articles54


Finally, the elements described by above form afeature vector f by concatenation. The resulting vec-tor is of following size: 66 points · 6 + 2145 distances+ 91 triangles · 3 angles · 2 = 3087

f = [px py peuc ∆px ∆py ∆peuc D α ∆α]T. (9)

�ue to the signi�icant size of the vector f , the fea-ture selection process is needed. Asmentioned earlier,

selection reduces the dimension of the feature vec-tor, which leads to increasing the speed of calculationsand simpli�ies the classi�ication. Feature selection canbe easier understood as a case of state space search.The state is de�ined as a certain subset of selectedfeatures, and the actions consist of adding new fea-tures to this subset or removal of any of those present.The whole problem can be then decomposed into twotasks: evaulation of a subset of attributes andmethods

55


Articles 55


for searching the state space. This heuristic approachis necessary because the large size of the feature vec-tor prevents a brute force search of the state space.The features selection is a single-time process that oc-curs only during the training of the system. The des-ignated set of features is used to build the classi�ier,while the use of the system, the knowledge of the in-dices of selected features allows the calculation of onlythe chosen attributes. The task of selection was car-ried out using the software WEKA [21]. To evaluatethe subset of attributes the evaluation method basedon correlation CfsSubsetEval was used, which is de-scribed in [22]. As a method of state space search al-gorithm BestFirst was selected. The choice of thesemethods was done on the basis of previous studies[23] and [24].

The classi�ication problem here is to build such aclassifying function which is able to correctly assigna particular object to the corresponding class throughthe appropriate processing of the training data by su-pervised learning. All of the the data is strictly numeri-cal and labeled. In the case of the presented identi�ica-tion system, the recognition is done by using elemen-tary classi�iers, such as linear (DA�in) and quadratic(DA�uad) discriminant analysis classi�iers, k-nearestneighbour (KNN) for k = 1, 3, 5, 7 as well as decisiontrees (DecTree). Due to the modular architecture ofthe system it is possible to employmore complex clas-si�iers (ie. neural network, support vector machine),however, for the purpose of clarity only simple classi-�iers are used in this paper.

3. Results3.�. ��t� ��u�s��

In order to evaluate the performance of the systemtwosets of datawereused: images gathered in labora-tory environment and in natural conditions. Solvingan easier problem (laboratory conditions) should de-termine whether the proposed feature set is suf�icientfor correct classi�ication. Then, if the system workssatisfactorily, one should consider introducing addi-tional factors.

The MUG facial expression database [25] wasused as input data set for laboratory conditions. Thedatabase contains frontal face images for more than50 people aged 20-35 years. Each personwas asked toexpress six basic emotions and a neutral facial expres-sion. The pictures were taken at 896x896 resolution,at a rate of 19 frames per second, in a professionallyequippedphotography studio under favorable lightingconditions.Due to the fact that the recorded sequencesstart and return to a neutral facial expression, it wasnecessary to extract the appropriate emotion imagesfrom the middle of the sequence.

Thenatural conditionswere simulatedby conduct-ing photo sessions for three people at the age of about25 years. The pictureswere takenby a standard laptopcamera, working at the resolution of 640x480 in nor-mal room lighting conditions. People who took partin the experiment were instructed on how to gener-ate speci�ic facial expressions depicting a particular

emotion in accordance to FACS system [4]. During therecording the person looked around the immediatevicinity of the camera, thus simulating the gentle headmovements which occur during a standard conversa-tion. For each person, six photo sessions took place,each consisting of seven emotion classes. For everyemotion in each session ten pictures were taken. To-tal number of images for every person was 6 sessions·7 emotion classes ·10pictures = 420 face images. Theinterval betweenpictureswas set at 800ms, due to theperformed movements of the head.3.�. �e��e e��lu��

In order to estimate the practical accuracy ofthe designed emotion recognition system two exper-iments were conducted: personal and interpersonalrecognition.

The goal of the �irst experiment was to recognizethe facial emotions of a single person using only theimages of his/her own face. This results in a per-sonal recognition system which should perform wellfor a given person, while omitting its usefulness forothers. Such a system was constructed and evalu-ated for six people – three from laboratory conditionsdatabase and three from natural conditions database.Moreover, two additional evaluationsweremade – thedata of three people from laboratory conditions weremerged into one dataset that was used to build a sys-tem designed for emotion recognition of these threepeople combined, the data of three representatives ofnatural conditions were subject to the same process.The assessment of the results was done through 10-fold crossvalidation – the input data set was dividedinto ten subsets, next the recognition processwas per-formed ten times, each time one of the subsets waschosen as validation data while the others were usedto train the system. It is vital to say that the feature se-lection process is a integral part of system training andwas performed anew for each of the training data. Thesuccess rate was evaluated as a ratio of correctly clas-si�ied instances to the total number of samples. Theresults are shown in tab. 1. In each case, the k-nearestneighbor classi�ier performs best. The greater numberof neighbor decreases the success rate of the classi�i-cation but increases the robustness of the system if thetraining data are subject to noise. It is noteworthy that,while performing very well for laboratory conditions,the discriminant analysis classi�iers struggle with nat-ural conditions andmerged datasets. This leads to twoconclusions. Firstly, complex problems may requirecomplex toolswhich suggests employingmore sophis-ticated classi�iers. Secondly, each modi�ications at fea-ture extraction and selection steps should be evalu-ated by classi�iers of lower success rate. Although thisis seemingly counter-intuitive, it allows for a clearerobservation of improvement. In addition, less complexclassi�iers usually run faster, so if their effectivenesswill be comparable to the more complex (but slower)classi�iers (see tab. 1, laboratory conditions), one canobtain increased speed at the expense of the ef�iciencyof the recognition system.

The second experiment focused on recognizing the

56


Articles56


�ab� �� Success rate �in �� for personal emo�on recogni�on for LA�orator� and NA�ural condi�ons�Numbers represent specific people, ALL represents a combined dataset consis�ng of all sub�ects

LAB 1 LAB 2 LAB 3 LAB ALL NAT 1 NAT 2 NAT 3 NAT ALL

DALin 99.67 99.79 99.71 92.63 86.19 88.81 94.52 80.95DAQuad 99.67 99.58 100.00 98.23 89.29 94.52 97.38 84.05DecTree 96.28 98.75 99.71 96.54 88.81 90.95 95.48 85.87KNN1 100.00 100.00 100.00 99.91 95.71 99.05 99.52 97.30KNN3 100.00 100.00 100.00 99.91 93.81 97.86 99.29 96.35KNN5 100.00 100.00 99.71 99.91 94.05 97.62 98.81 95.63KNN7 100.00 100.00 99.71 99.91 92.38 97.62 98.33 95.56

�ab� �� Success rate �in �� for interpersonal emo�on recogni�on for LA�orator� condi�ons�Numbers represent specific people, AVG is an average performace of a classifier

LAB 1 LAB 2 LAB 3 LAB 4 LAB 5 LAB 6 LAB 7 LAB 8 LAB 9 LAB 10 AVG

DALin 80.00 75.71 100.00 74.29 87.14 60.00 74.29 77.14 57.14 58.57 74.43DAQuad 91.43 81.43 91.43 94.29 87.14 51.43 88.57 85.71 75.71 51.43 79.86DecTree 75.71 87.14 75.71 65.71 85.71 55.71 65.71 98.57 68.57 70.00 74.86KNN1 95.71 87.14 95.71 82.86 77.14 67.14 77.14 87.14 90.00 60.00 82.00KNN3 91.43 87.14 97.14 82.86 77.14 67.14 82.86 94.29 75.71 70.00 82.57KNN5 91.43 87.14 98.57 84.29 80.00 68.57 77.14 92.86 72.86 68.57 82.14KNN7 91.43 85.71 95.71 84.29 82.86 74.29 80.00 90.00 74.29 68.57 82.71

�ab� �� omparison of success rate for personal emo�on recogni�on bet�een full feature vector �NS � no selec�on�and a SELected feature vector

NAT1 NS NAT1 SEL NAT2 NS NAT2 SEL NAT3 NS NAT3 SEL NS AVG SEL AVG

DALin 73.33 86.19 85.00 88.81 89.05 94.52 82.46 89.84DAQuad 79.52 89.29 87.86 94.52 97.14 97.38 88.17 93.73DecTree 86.19 88.81 90.71 90.95 93.81 95.48 90.24 91.75KNN1 95.24 95.71 98.33 99.05 99.29 99.52 97.62 98.09KNN3 94.76 93.81 98.10 97.86 99.05 99.29 97.30 96.99KNN5 93.10 94.05 97.14 97.62 98.33 98.81 96.19 96.83KNN7 89.52 92.38 96.43 97.62 97.86 98.33 94.60 96.11

facial emotions of a person that is completely new tothe system based on the portrayals of emotions sub-mitted by other people. This interpersonal recogni-tion system should be able to correctly assign emo-tions to various people, however the recognition suc-cess rate is expected to drop signi�icantly in compar-ison with the personalized system. The system wasconstructed using data from ten people, all from labo-ratory conditions database. In respect to crossvalida-tion, each person was considered a subset of its own– the recognition process was done ten times, eachtime the emotions of ninepeoplewereused as trainingdata while the emotions of the remaining person wastreated as validation data. The results are presentedin tab. 2. The drop in success rate is indeed signi�icant.This is due to the personal differences in anatomy aswell as in execution of certain emotion by a speci�icperson. One can observe that two highest recognitionresults – subject 1 and 3 – are expected to have verysimiliar anatomy and manner of facial expressions. Infact, they both arewomen of delicate body consitution

and their expressions are much alike. In case of thelowest result – subject 10 – the confusion matrix (notincluded here) shows that two of his emotional ex-pressions were classi�ied as entirely wrong, whereasother �ive were mostly correct. This is most probablydue to the lack of appropriate template in the train-ing data. In conclusion, ten people are not enough tocover thewhole feature space for emotion recognition– more samples are needed – however, the obtainedsuccess rate of above 80% is still satisfactory, but canbe easily improved. The results are inconclusive as towhich classi�ier has performed the best.

The achieved recognition rate is dif�icult to com-pare between diffent expression recognition systems,due to the different databases used for system evalu-ation. However, results obtained by other researcherscan be found in [6].

The results of classi�ication in the case whenthe feature selection process has been omitted arepresented in tab. 3. Feature selection improves the

57


Articles 57


recognition quality and at the same time signi�icantlyspeeds up the system performance – a vector of morethan 3000 features is reduced to about 100 elements,which drastically simpli�ies feature calculation as wellas classi�ier training and use.

4. Summary

This paper presents a preliminary concept of vi-sual emotion recognition system for the use of a so-cial robot. The proposed methodology focuses on fea-ture extraction and selection, which simpli�ies de�in-ing new features and adding them to the feature set.Face tracking and parametrization is performed bya designated application FaceTracker [17], while fea-ture selection is done by data mining software WEKA[21]. The evaluation of the system uses two discrim-inant analysis classi�iers, decision tree classi�ier andfour variants of k-nearest neighbors classi�ier. The sys-tem recognizes seven emotions. The veri�ication steputilizes twodatabases of face images representing lab-oratory and natural conditions. Personal and inter-personal emotion recognitionwas evaluated. The bestquality of classi�ication for personal emotion recog-nition was achieved by 1�� classi�ier, the recogni-tion rate was 99.9% for the laboratory conditions and97.3% for natural conditions. For interpersonal emo-tion recognition the rate was 82.5%.

The obtained results are promising and suggestfurther development of proposedmethodology. Thereare different directions in which the project can be de-veloped. One can easily add new features, such as tex-ture information, in order to enhance the classi�icationrate. More sophisticated classi�iers can be used to in-crease the robustness of the recognition system. Moretraining data can be presented for better representa-tion of the feature space. Moreover, adaptive learningalgorithms are needed for practical application in a so-cial robot working in a dynamic environment. Authorstrongly believes inmultimodal approach for the emo-tion recognition problem. Combining the visual cueswith speech processing can provide more plausibleemotional information. The immediate step will focuson processing of the obtained information in order forcontextual analysis. Somework has already been donein the direction of utilizing this software for peopleidenti�ication. Being able to tell people appart, the sys-tem can serve with personal emotion recognition anda social robot should provide an individual experiencefor each user.

Acknowledgements

This research was supported by Wroclaw Univer-sity of Technology under a statutory research project.

AUTHORMateusz Żarkowski∗ – Wroclaw Universityof Technology, Institute of Computer Engi-neering, Control and Robotics, e-mail: ma-

[email protected].∗Corresponding author

REFERENCES[1] G.B. Duchenne de Bologne, Mechanisme de la Phy-

sionomie Humaine. Paris, Jules Renouard Libraire1862.

[2] C. Darwin, The Expression of the Emotions in Manand Animals. Anniversary edition, Harper Peren-nial 1872/2009.

[3] P. Ekman, Emotion in the Human Face. CambridgeUniversity Press 1982.

[4] P. Ekman, W. Friesen, Facial Action Coding System:A technique for the measurement of facial move-ment. Palo Alto, Consulting Psychologists Press1978.

[5] M. Pantic, L. Rothkrantz, ”Expert system for au-tomatic analysis of facial expressions”, Image andVision Computing Journal, 2000, vol. 18, no. 11, pp.881–905.

[6] Y. Tian, T. Kanade, J. Cohn, Facial Expression Recog-nition. Handbook of FaceRecognition, 2nd ed. 2011,Springer

[7] M. Pantic, M. Barlett, ”Machine Analysis of FacialExpressions”, In: Face Recognition. I-Tech Educa-tion and Publishing, 2007, pp. 377–416.

[8] P. Viola, M. Jones, ”Robust real-time face detec-tion”, International Journal of Computer Vision,2004, vol. 57, pp. 137–154.

[9] X. Li, Q. Ruan;, Y. Ming, ”3D Facial expressionrecognition based on basic geometric features”,2010 IEEE 10th International Conference on Sig-nal Processing (ICSP), 2010.

[10] I. Kotsia, I. Pitas, ”Facial expression recogni-tion in image sequences using geometric defor-mation features and Support Vector Machines”,IEEE Trans Image Process, vol. 16, no. 1, 2007, p.172–187.

[11] C. Lien, L. Lin; C. Tu, ANewAppearance-Based Fa-cial Expression Recognition Systemwith ExpressionTransition Matrices. Innovative Computing Infor-mation and Control, 2008.

[12] M. Pantic, L. Rothkrantz, ”Automatic analysisof facial expressions: The state of the art”, IEEETrans. Pattern Anal. Mach. Intell., December, 2000,vol. 22, vo. 12, pp. 1424–1445.

[13] B. Fasel, J. Luettin, ”Automatic facial expressionanalysis: A survey”, PatternRecognition, 1999, vol.36, no. 1, pp. 259–275.

[14] M. Żarkowski, ”Facial emotion recognitionin static image”. In: 12 National Conferenceon Robotics, Świeradów-Zdrój, 2012, vol. 2, pp.705–714 (in Polish).

[15] T. F. Cootes et al. ”Active Shape Models – theirtraining and application”, Comput. Vis. Image Un-derst., January 1995, vol. 61, no. 1, pp. 38–59.

58


Articles58


[16] T. Cootes, G. Edwards, C. Taylor. ”Active Appear-ance Models”, IEEE Trans. Pattern Anal. Mach. In-tell., June, 2001, vol. 23, no. 6, pp. 681–685.

[17] J. Saragih, FaceTracker,http://web.mac.com/jsaragih/FaceTracker/FaceTracker.html

[18] Y. Tian, T. Kanade, J. Cohn. Recognizing actionunits for facial expression analysis IEEE Trans.Pattern Anal. Mach. Intell., 23(2), 2001, pp. 1–19.

[19] J. Saragih, S. Lucey, J. Cohn, ”Face alignmentthrough subspace constrained mean-shifts. In:ICCV. Proceedings, 2009, pp. 1034–1041.

[20] J. Saragih, S. Lucey, J. Cohn, ”Deformable model�itting by regularized landmarkmean-shift”. Inter-national Journal of Computer Vision, 2011, nol. 91,no. 2, pp. 200–215.

[21] Weka 3: Data Mining Software in Java,http://www.cs.waikato.ac.nz/ml/weka/

[22] M.A. Hall, Correlation-based Feature Subset Selec-tion for Machine Learning. PhD thesis, Universityof Waikato, Hamilton, New Zealand, 1998.

[23] Ł. Juszkiewicz, Speech emotion recognition for asocial robot. Master thesis, Wrocław University ofTechnology, Wrocław, 2011 (in Polish).

[24] M. Żarkowski, Set of procedures helping throughthe learning process of using EMG signals for con-trol. Master thesis, Wrocław University of Tech-nology, Wrocław, 2011 (in Polish).

[25] N. Aifanti, C. Papachristou, A. Delopoulos, ”TheMUG facial expression database”. In: 11th Int.Workshop on Image Analysis for Multimedia In-teractive Services. Proceedings, Desenzano, Italy,April 12–14, 2010.

59


[16] T. Cootes, G. Edwards, C. Taylor. ”Active Appear-ance Models”, IEEE Trans. Pattern Anal. Mach. In-tell., June, 2001, vol. 23, no. 6, pp. 681–685.

[17] J. Saragih, FaceTracker,http://web.mac.com/jsaragih/FaceTracker/FaceTracker.html

[18] Y. Tian, T. Kanade, J. Cohn. Recognizing actionunits for facial expression analysis IEEE Trans.Pattern Anal. Mach. Intell., 23(2), 2001, pp. 1–19.

[19] J. Saragih, S. Lucey, J. Cohn, ”Face alignmentthrough subspace constrained mean-shifts. In:ICCV. Proceedings, 2009, pp. 1034–1041.

[20] J. Saragih, S. Lucey, J. Cohn, ”Deformable model�itting by regularized landmarkmean-shift”. Inter-national Journal of Computer Vision, 2011, nol. 91,no. 2, pp. 200–215.

[21] Weka 3: Data Mining Software in Java,http://www.cs.waikato.ac.nz/ml/weka/

[22] M.A. Hall, Correlation-based Feature Subset Selec-tion for Machine Learning. PhD thesis, Universityof Waikato, Hamilton, New Zealand, 1998.

[23] Ł. Juszkiewicz, Speech emotion recognition for asocial robot. Master thesis, Wrocław University ofTechnology, Wrocław, 2011 (in Polish).

[24] M. Żarkowski, Set of procedures helping throughthe learning process of using EMG signals for con-trol. Master thesis, Wrocław University of Tech-nology, Wrocław, 2011 (in Polish).

[25] N. Aifanti, C. Papachristou, A. Delopoulos, ”TheMUG facial expression database”. In: 11th Int.Workshop on Image Analysis for Multimedia In-teractive Services. Proceedings, Desenzano, Italy,April 12–14, 2010.

59


S E R S S RS E R S S RS E R S S RS E R S S R


Łukasz Juszkiewicz

DOI: 10.14313/JAMRIS_4-2013/59

Abstract:The paper presents a speech emo on recogni on systemfor social robots. Emo ons are recognised using globalacous c features of the speech. The system implementsthe speech parameters calcula on, features extrac on,features selec on and classifica on. All these phases aredescribed. The system was verified using the two emo-onal speech databases: Polish and German. Perspec-ves for using such system in the social robots are pre-

sented.

Keywords: speech emo on recogni on, prosody, ma-chine learning, Emo-DB, intona on, social robot

1. Introduc onRecently, a new ield of robotics is being devel-

oped: a social robotics. A social robot is able to com-municate with people in an interpersonal manner andachieve social and emotional goals [25]. Social robotsaremeant to collaboratewithpeople andbe their com-panions in applications such as education, entertain-ment, health care etc.

Emotions play a signi icant role in an interpersonalcommunication, so an ability to recognise other peo-ple’s emotions from speech and adapting one’s re-sponse is an important social skill. Person without it,would have dif iculties with living in a society. Simi-larly, the social robot needs to be able to recognise ex-pressions of emotions, so that it could communicatewithman in the natural manner and could be his com-panion.

Information on emotion is encoded in all aspects oflanguage, in what is said and in how it is said or pro-nounced, and the “how” is even more important thanthe “what” [29]. Considering all levels of language,frompragmatics down to the acoustic level, the follow-ing thing can be said: Starting with pragmatics, inten-tion of speaker is highly correlated with his emotionalstate [22]. The literal meaning of an utterance is themost obvious display of emotions, so the statementsor keywords such as “I am sad” can be treated as emo-tion indicators [19]. However, explicit expressions ofemotions can be intended as ironic or not express trueemotions of the speaker.

Another indicator of the emotions is a tone of avoice, that is the phonetic and acoustic propertiesof speech. For example, the cracking voice could bethe evidence of an excitement. Voice quality and theprosody (pitch, intensity, speaking rate) have beenbest researched in the psychological studies. They also

intuitively seem to bemost important in expression ofemotions. An often cited review of literature on emo-tions in speechwaswrittenbyMurray andArnott [17].They refer to a number of studies which seem to haveidenti ied almost explicit correlation between emo-tions and acoustic parameters. However in the studiesof different authors con licting results can be found.This is probably due to the large variability of the ex-pression of emotions and different variants of the cer-tain emotions, such as “hot” and “cold” anger [8].

The system has been developed to recognise emo-tions in recorded statements on the basis of acousticfeatures of speech. Section 2 discusses the structure ofthe system and its implementation. Section 3 presentsthe results of system’s veri ication. In conclusion theresults are evaluated and further development plansare presented.

2. Speech Emo on Recogni on SystemCommonly used pattern recognition algorithms

are applicable to the speech emotion recognitionproblem. However, there are at least two different ap-proaches. One is estimating the short-time signal pa-rameters andmodelling their changeswith theHiddenMarkovModels or similar [16,18]. Theother is extract-ing global features of the signal and applying statisticalmethods and various types of classi iers: SVM [5, 33],arti icial neural networks [9, 30], decision trees [26]and other. The second approach was chosen — eachutterance is analysed as awhole, so global features areextracted and then classi ied.

There is no compliance about an optimal set of fea-tures for speech emotion recognition. Moreover, sucha set would depend on a number and type of emotionsbeing recognised, language, recording conditions etc.Therefore, a standard approach is to extract very largefeatures vector (even about 4,000) and then reducethe number of features to obtain a subset that has bet-ter discriminative power in particular task [23]. Com-monly used features selection algorithms are prin-cipal component analysis, linear discriminant analy-sis, information gain ratio, sequential forward loatingsearch [24,31,32].

The speech emotion recognition system [28] hasthe form of a set of MATLAB scripts using externaltools — the programs Praat and Weka. Praat is afree (Open Source) program for phonetic analysis ofspeech. It can compute several parameters of speech:pitch, formants, spectrum, MFCC and many others[20]. Weka 3 (Waikato Environment for KnowledgeAnalysis) is a popular suite of machine learning soft-

59


ware written in Java. It contains a collection of visuali-sation tools and algorithms for data preprocessing, il-tration, clasterisation, classi ication, feature selectionand regressionmodelling [10]. It is freely available un-der GNU General Public License. This form of the sys-tem allows easy modi ications and facilitates testingvarious algorithms. The structure of the system is il-lustrated in igure 1. There are two phases of the sys-tem’s operation: learning phase and evaluation phase.In the off-line learning phase feature selection and su-pervised learning of classi ier is carried. In the eval-uation phase the prepared system is used for on-linespeech emotion classi ication.2.1. Speech signal acquisi on

The irst stage of a recognition process is a speechsignal acquisition—an acoustic signal has to be trans-formed into an electrical signal and then digitalised.Quality of a usedmicrophone and a preampli ier playsimportant role — distortions may affect recognitionaccuracy. Regardless of the quality of the recordingequipment there still is a problem of an environmen-tal noise and the noise generated bymotors and othermoving parts of the robot itself. It was assumed thatin recordings used for classi ication, signal to noise ra-tion would be high enough and there would be no ad-ditional voices (one person speaking at the moment).2.2. Speech signal parametrisa on

Todetermine the emotionof the speech, six param-eters of the speech signal were used: intensity, spec-trogram, pitch, mel-frequency cepstral coef icients,long-term average spectrum and harmonics to noiseratio. Some of them were computed using several dif-ferentmethods, because they gave slightly different ef-fects.

Intensity is the instantaneous sound pressure valuemeasured in dB SPL, ie dB with respect to pressure of2 · 10−5Pa [20].

Spectrogram The speech signal is split into 16msframes with a 10ms step. Each frame is Fourier trans-formed to compute the magnitude of the frequencyspectrum. The phase is neglected. Then logarithm istaken from the magnitude spectrum, and the result isappended to a matrix. This matrix is called spectro-gram [12].

Pitch is a fundamental frequency of the speech. Itis produced by vocal folds (known commonly as vo-cal cords). Two algorithms were used to estimatethe pitch: autocorrelation method [1] and cross-correlation method [21]. The pitch exists only for thevoiced parts of the signal, so resultingwaveform is dis-continuous. Built in smoothing and interpolation func-tions are used to overcome this issue.

Mel-frequency Cepstral Coefficients (MFCC) are com-monly used for parametrisation of the speech [20,27].Spectrum of the windowed signal is analysed by a

bank of 26 bandpass ilters with a central frequencyequally spaced on a mel-scale, re lecting the subjec-tively perceived pitch. Subsequently, a discrete cosinetransform is used to a logarithmised spectrum into acepstrum. Only the irst 12 coef icients are used. Ad-ditionally, a 13rd series is computed as the average ofall 12 series of the coef icients.

Harmonics to noise ra o (HNR) is energy of the har-monic parts of the signal related to the energy ofthe noise parts. HNR is expressed in dB and com-putedusing the autocorrelationmethod and the cross-correlation method [1].

Long-Term Average Spectrum (LTAS) is averaged log-arithmic power spectral density of the voiced partsof the signal with an in luence of the pitch correctedaway [13].2.3. Feature extrac on

In order to extract more useful information fromobtained parameters vectors additional vectors arederived from them:- irst and second order difference,- values of local minima,- values of local maxima,- distance between adjacent extrema,- value of difference of the adjacent extrema,- slopes between the adjacent local extrema,- absolute values of the two above.The acoustic features are time-series — their lengthdepends on the duration of analysed utterance. Forclassi ication purposes, it is necessary to convert thetime series into a feature vector of ixed length. Thisis achieved by treating time series as outcomes of ran-dom variables and computing their statistics:- arithmetic mean,- median,- standard deviation,- global maximum,- global minimum,- irst quartile,- second quartile,- range,- interquartile range.Further they will be referred to as the basic statistics.

Algorithm of extracting the features from the rawacoustic features is basically uniform (except for spec-trogram and LTAS). For each of them, treated as timeseries, there are computed:- basic statistics,- linear regression coef icients,- basic statistics of derived series,- linear regression coef icients of local maxima,- linear regression coef icients of local minima.

60


Speech signal acquisition

(MATLAB)

Feature extraction

(Praat & MATLAB)

Feature selection

(Weka)

Removing

unnecessary

features

(Weka)

Classificator training

(Weka)

Classification

(Weka)

Training

Testing

Feature extraction

(Praat & MATLAB)Speech signal acquisition

(MATLAB)

Fig. 1. Block diagram of speech emo on recogni on system.

For each of the spectra forming spectrogram andfor LTAS, there are computed:- linear regression coef icients,- centre of gravity, de ined as CG = ΣfiEi

ΣEi, where: fi

— i-th frequency,Ei —energy of fi,- 9th and 1st decile difference,- slope between global maximum and minimum.For LTAS, coef icients listed above are inal features.For spectrogrambasic statistics of them are computedto obtain inal features.

Using this method 1722 features are generated.The structure of the feature vector is illustrated in ig-ure 2. The number of features is then reduced in thefeature selection process.

Fig. 2. Structure of feature vector.

2.4. Feature selec on

Weka’s function AttributeSelection was usedto remove features that are redundant or not rel-evant for this particular task. This function re-quires subset evaluation and search functions. Forevaluation correlation-based feature subset selectionCfsSubsetEval was used — subsets of features thatare highly correlated with the class while having lowintercorrelation are preferred [11]. Search functionwas BestFirst, which searches the space of attributesubsets by greedy hill climbing augmented with abacktracking facility.

Feature selection is done in training phase. There-fore the selected feature vector depends on the train-ing set.

2.5. Classifica on

Several different classi iers, provided by the Wekapackage, were tested:- multilayer perceptron,- support vectormachinewith sequentialminimal op-timisation,

- radial basis function network,- Bayes network.Results obtained from each of those classi iers werevery similar. The feature selection stage seems tobe crucial — selecting minimal vector of discrimi-native features makes the choice of classi ier a mi-nor problem. In section 3 the average recognitionratios for every classi ier are presented. However,detailed classi ication results are presented only forBayesNet classi ier — Bayes network that can usevarious search algorithms and quality measures [2].Here, SimpleEstimator (estimating probabilities di-rectly from data) and K2 (implementing hill climbing)were used [7].

Advantage of Bayes Network classi ier is the rela-tively low complexity while maintaining performancecomparable with more complex classi iers. Additionaladvantage is explicit representation of knowledge —created network can be analysed or used for otherpurposes. Learning of Bayesian classi ier is proneto mutual correlation of features — redundant onescould be dominant [14]. However, the algorithm usedfor feature selection removes redundant features, sothis problem is eliminated.

3. ResultsThe tests were carried in two stages. In the irst

stage, two available databases of acted emotionalspeech: Berlin Database of Emotional Speech andDatabase of Polish Emotional Speech were used. Theycontain recordings registered in studio conditions, dif-ferent from those that are expected in social robot

61


application. Those test, however, can verify the cho-sen methodology. The second stage is veri ication forthe intended use of system — test recordings shouldmeet typical responses to social robot and should notbe recorded in such arti icial conditions as mentioneddatabases.

3.1. Speech corpora

Berlin Database of Emo onal Speech (Emo-DB) wasrecorded at the Technical University of Berlin [4]. Fivemale and ive female carefully chosen speakers pre-tended six different emotions (anger, joy, sadness,fear, disgust and boredom) and neutral state in ten ut-terances— ive consisting one phrase and ive consist-ing two phrases. Every utterance is in German and hasemotionally neutralmeaning. After recordings 20peo-ple were asked to classify emotion of each utteranceand rate its naturalness. Utterances thatwere not clas-si ied correctly by more than four persons or consid-ered as unnatural by more than two were discarded.Resulting database consists of 493 recordings.

Database of Polish Emo onal Speech consists of 240recordings [15]. Four male and four female actorspeakers were asked to enact in ive phrases ive dif-ferent emotions (anger, joy, sadness, fear and bore-dom) as well as the emotionally neutral state. Phrasesare uttered in Polish and their meaning is emotionallyneutral. The number of recordings is the same for eachemotion. These recordings were evaluated by 50 hu-mans — each of them was asked to classify 60 ran-domly selected samples. The average recognition ratiowas 72%.

3.2. Speaker independent

Tests were carried out for the full set of recordingsform both databases excluding those marked as dis-gust. There are no recordings of this emotion in Polishdatabase, so itwas ruled out to to keep the same condi-tions for both tests. Tenfolds cross-validationwasusedto evaluate classi ication accuracy.

Emo-DB A subset of Emo-DB used for experimentsconsisted of 489 utterances: 71 of joy, 62 of sadness,69 of fear, 127 of anger, 81 of boredom and 79 neutral.Selected features vector was 123 elements long. Table1 summarises numbers of the features derived fromeach of the parameters as well as the maximum andaverage information gain ratio (IGR) of those features.Despite the fact, that features derived from the MFCCand the pitch form a major part of vector, LTAS fea-tures have the higher average IGR. All other acousticparameters are represented among selected features— each of the speech parameters is relevant and non-redundant. However, it should be noticed that HNRhas signi icantly lower contribution thanother param-eters. System classi ied properly 396 (81,98%) in-stances using the Bayes net classi ier. Table 2 showsconfusion matrix and accuracy of classi ications ofeach emotion. Results achieved using other classi ierswere: 79.7% for perceptron, 80.1% for RBF network

Tab. 1. Number of features derived from the individualparameters of speech signal for Emo-DB

Parameter No. of feat. Max. IGR Avg. IGRMFCC 45 0.446 0.258Pitch 29 0.434 0.241Intensity 22 0.478 0.245HNR 13 0.189 0.133Spectrogram 8 0.429 0.213LTAS 6 0.440 0.268Σ 123

and 80.9% for SVM. In comparison in the study [30]accuracy of 79.47 % was achieved. In the studies [26]and [29] whole database (including disgust) was usedand respectively 78.58% and 66,5% were archived.

Tab. 2. Confusion matrix for Emo-DB

Classi ied asN J S F A B

Actualcla

ssN 68 0 1 3 0 7 86,1%J 2 37 0 8 24 0 52,1%S 3 0 58 0 0 1 93,5%F 1 8 4 52 4 0 75,4%A 0 11 0 4 112 0 88,2%B 8 0 3 1 0 69 85,2%

Polish Database In experiment for Polish language,all 240 recordings (40 of each emotion) were used.There were 86 selected features. Table 3 summarisesnumbers of the features derived from each of the pa-rameters as well as the maximum and average infor-mation gain ratio (IGR) of those features. It can be no-ticed that theLTAShas signi icantly higher average IGRthan theotherparameters. System, using theBayesnet

Tab. 3. Number of features derived from the individualparameters of speech signal for Database of Polish Emo-onal Speech

Parameter No. of feat. Max. IGR Avg. IGRMFCC 41 0.467 0.287Pitch 18 0.458 0.224Intensity 12 0.390 0.278HNR 6 0.320 0.223LTAS 6 0.666 0.454Spectrum 3 0.240 0.200Σ 86

classi ier, properly classi ied 177 (73,75%) instances(64,18% in [6]). Results achieved using other classi-iers were: 71.7% for perceptron, 67.9% for RBF net-work and70.8% for SVM. Table 4 shows confusionma-trix and accuracy of classi ications of each emotion.

3.3. Speaker dependent

Experiments with a speaker dependent recogni-tion were carried out for the recordings from the

62


Tab. 4. Confusion matrix for Database of Polish Emo-onal Speech

Classi ied asN J S F A B

Actualcla

ss

N 31 0 2 3 0 4 77,5%J 2 32 0 0 6 0 80,0%S 0 0 35 1 0 4 87,5%F 1 0 8 21 8 2 52,5%A 0 7 0 4 29 0 72,5%B 3 0 7 1 0 29 72,5%

Database of Polish Emotional Speech. Feature selec-tion and classi ication processwas done separately foreach speaker’s 30 utterances. Length of the featurevectors and classi ication accuracy were summarisedin table 5.

3.4. For social robots

In order to test the developed system for its us-ability for speech emotion recognition for the socialrobots, group of people was asked to utter phrasesexpressing four different emotional states: happiness,compassion, contempt and the neutral state. Thesestatements were intended to meet the most com-mon people’s reactions to short-term contact socialrobot. The social robot Samuel ( igure 3) was givenas an example [3]. It was assumed that in sucha task is better to recognise fewer emotions withthe (expected) better accuracy. In comparison to thediscussed databases, the meaning of phrases corre-sponds to the expressed emotion—it was concludedthat in this case it is more likely. Example sentencesare (English translation in brackets):- Ale wyczesany robot! (What a cool robot!)- Musi ci być smutno tak stać…(Itmust be sad to standlike this…)

- Weźcie to stąd! (Take this away!)- Na podłodze leży dywan. (Carpet lies on the loor).

Recordingswere taken in classroom in presence ofmany people at the same time, so conditions were farfrom the studio both in terms of acoustics and back-ground noise. Electret microphone was used alongwith battery powered preamp and an external USBsound card. Sampling ratewas set at 22050Hz, and theresolutionwas 16b. During experiment 160 sentenceswere recorded (40 for each emotion). One recordingturned out to be corrupted, so it was ruled out.

Test set contained 159 recordings (39 for hap-piness, and 40 for each of other emotions). Systemperformance was evaluated using Bayes net classi ierand tenfold cross-validation. System correctly classi-ied 110 instances (69%). Table 6 shows the confusionmatrix and the accuracy of classi ication of each emo-tion. In the feature selection process 33 features wereselected: 19 derived from MFCC, ive from pitch, fourfrom spectrogram, two from intensity, two from LTASand one from HNR. Differently from previous experi-ments, the highest average IGR have features derivedfrom the intensity (0.269) and those derived from

LTAS havemuch lower avg. IGR (0.201). The lower avg.IGR have HNR (0.164).

4. ConclusionsThis paper presents anddiscusses the speech emo-

tion recognition systembased on the acoustic featuresof speech, apart from its semantic. Veri ication of thesystem using the Berlin Database of Emotional Speechand Polish Database of Emotional Speech con irms theeffectiveness of the chosen feature extraction, selec-tion and classi ication methods. It should be noted,however, that in Polish language fear is clearly lessrecognised than other emotions, so is joy in the Ger-man language. For both languages, joy is often con-fused with anger — these emotions with opposite va-lence both have high arousal and are close to eachother in acoustic feature space. This fact is important,because if the systemwas used by the social robot thistype ofmistake could result in incorrect response. Oneof the solutions could be weighting errors of misclas-si ing different pairs of emotions.

Veri ication for the target application showed thatit is possible for the developed system to recogniseemotions in recordings made in “non-sterile” condi-tions. Achieved recognition accuracy is promising forthe future usage. However, social robot, designed to behuman‘s companion, should be able to recognisemoreemotional states than the short-term contact robot.Therefore, further experiments and research shouldbe carried out, especially concerning noise robust-ness.

AcknowledgementThis research was supported by Wroclaw Univer-

sity of Technology under a statutory grant.

Fig. 3. Social robot Samuel.

63


Tab. 5. Feature vector length and classifica on accuracy for speaker dependent tests.

Speaker M1 M2 M3 M4 F1 F2 F3 F4Length of feature vector 25 26 42 30 39 25 37 28Classi ication accuracy 96,7% 100% 100% 100% 100% 96,7% 100% 100%

Tab. 6. Confusion matrix for social robot experiment.

Classi ied asNeu. Cont. Hap. Comp.

Cor.c

l. N 30 7 1 2 75%Ct 5 30 2 3 75%H 4 9 25 1 64%Cm 8 0 7 25 63%

AUTHORŁukasz Juszkiewicz∗ – Wrocław University of Tech-nology, Institute of Computer Engineering, Controland Robotics, 50-370 Wrocław, Wybrzeże Wyspi-ańskiego 27, e-mail: [email protected].∗Corresponding author

REFERENCES[1] P. Boersma, “Accurate Short-Term Analysis of the

Fundamental Frequency and the Harmonics-to-Noise Ratio of a Sampled Sound”, Institute of Pho-netic Sciences, University of Amsterdam, Proceed-ings, vol. 17, 1993, pp. 97–110.

[2] R. R. Bouckaert, “Bayesian network classi iers inweka for version 3-5-7”, Network, 2008.

[3] R. Budziński, J. Kędzierski, andB.Weselak. “Headof social robot Samuel – construction (in Pol-ish)”. In: Prace Naukowe Politechniki Warsza-wskiej, Elektronika, pp. 185–194, z. 175, t. I.O icyna Wydawnicza Politechniki Warszawskiej,2010.

[4] F. Burkhardt, A. Paeschke, M. Rolfes, W. F.Sendlmeier, and B. Weiss. A database of germanemotional speech, volume 2005, pp. 3–6. Cite-seer, 2005.

[5] S. Casale, A. Russo, G. Scebba, and S. Serrano,“Speech emotion classi ication using machinelearning algorithms”. In: Proceedings of the 2008IEEE International Conference on Semantic Com-puting, Washington, DC, USA, 2008, pp. 158–165.

[6] J. Cichosz and Ślot K., “Emotion recognition inspeech signal using emotion-extracting binarydecision trees”. In: Proceedings of the 2nd Inter-national Conference on Affective Computing andIntelligent Interaction (ACII): Doctoral Consor-tium, 2007.

[7] G. F. Cooper andT.Dietterich, “Abayesianmethodfor the induction of probabilistic networks fromdata”. In:Machine Learning, 1992, pp. 309–347.

[8] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis,G. Votsis, S. Kollias, W. Fellenz, and J. G. Taylor,

“Emotion recognition in human-computer inter-action”, IEEE Signal Processing Magazine, vol. 18,no. 1, 2001, pp. 32–80.

[9] K. Dautenhahn. Socially intelligent agents: cre-ating relationships with computers and robots,chapter Creating emotion recognition agents forspeech signal. Multiagent systems, arti icial soci-eties, and simulated organizations. Kluwer Aca-demic Publishers, 2002.

[10] M. Hall, E. Frank, G. Holmes, B. Pfahringer,P. Reutemann, and I. H. Witten, “The weka datamining software: an update”, SIGKDD Explor.Newsl., vol. 11, 2009, pp. 10–18.

[11] M. A. Hall. Correlation-based Feature SubsetSelection for Machine Learning. PhD thesis,Department of Computer Science, University ofWaikato, Hamilton, New Zealand, April 1999.

[12] S. Haykin, Advances in spectrum analysis and ar-ray processing, Prentice Hall advanced referenceseries: Engineering, Prentice Hall, 1995.

[13] E. Keller, “The analysis of voice quality in speechprocessing”. In: Lecture Notes in Computer Sci-ence, 2005, pp. 54–73.

[14] P. Langley and S. Sage, “Induction of selectivebayesian classi iers”. In: Conference on uncertain-ity in arti icial intelligence, 1994, pp. 399–406.

[15] Lodz University of Technology, Medical Elec-tronics Division. “Database of Polish EmotionalSpeech”. http://www.eletel.p.lodz.pl/bronakowski/med_catalog/docs/licence.txt.

[16] X. Mao, B. Zhang, and Y. Luo, “Speech emotionrecognition based on a hybrid of HMM/ANN”. In:Proceedings of the 7th Conference on 7th WSEASInternational Conference on Applied Informaticsand Communications - Volume 7, Stevens Point,Wisconsin, USA, 2007, pp. 367–370.

[17] I.Murray and J. Arnott, “Toward the simulation ofemotion in synthetic speech: a review of the lit-erature on human vocal emotion”, Journal of theAcoustic Society of America, vol. 93, no. 2, 1993,p. 1097–1108.

[18] T. L. Nwe, S. W. Foo, and L. C. D. Silva, “Speechemotion recognition using hidden markov mod-els”, Speech Communication, vol. 41, 2003, pp.603–623.

[19] A. Osherenko and E. André, “Differentiated se-mantic analysis in lexical affect sensing”, 20093rd International Conference on Affective Com-puting and Intelligent Interaction andWorkshops,2009, pp. 1–6.

64

http://www.eletel.p.lodz.pl/bronakowski/med_catalog/docs/licence.txt




[20] D. W. Paul Boersma. “Praat: doing phonetics bycomputer (version 5.2.05)”, 2010.

[21] S. Samad, A. Hussain, and L. K. Fah, “Pitch detec-tion of speech signals using the cross-correlationtechnique”. In: TENCON 2000. Proceedings, vol. 1,2000, pp. 283 – 286.

[22] S. Schnall, “The pragmatics of emotion language”,Psychological Inquiry, vol. 16, no. 1, 2005, pp.28–31.

[23] B. Schuller, A. Batliner, D. Seppi, S. Steidl, T. Vogt,J. Wagner, L. Devillers, L. Vidrascu, N. Amir,L. Kossous, and V. Aharonson, “The relevanceof feature type for the automatic classi icationof emotional user states: Low level descriptorsand functionals”. In: Proceedings of Interspeech,Antwerp, Belgium, 2007.

[24] B. Schuller, S. Reiter, R. Muller, M. Al-Hames,M. Lang, and G. Rigoll, “Speaker independentspeech emotion recognition by ensemble classi-ication”. In: Multimedia and Expo, 2005. ICME2005. IEEE International Conference on, 2005, pp.864–867.

[25] B. Siciliano and O. Khatib, eds., Springer Hand-book of Robotics, Springer, 2008.

[26] J. Sidorova, “Speech emotion recognition withTGI+.2 classi ier”. In:Proceedings of the 12th Con-ference of theEuropeanChapter of theAssociationfor Computational Linguistics: Student ResearchWorkshop, 2009, pp. 54–60.

[27] S. S. Stevens, J. Volkmann, and E. B. Newman, “Ascale for the measurement of the psychological

magnitude pitch”, Journal of the Acoustical Soci-ety of America, vol. 8, no. 3, 1937, pp. 185–190.

[28] Łukasz Juszkiewicz. Postępy robotyki, chap-ter Speech emotion recognition system for so-cial robots (in Polish), pp. 695–704. O icynawydawnicza PW, 2012.

[29] T. Vogt. Real-time automatic emotion recognitionfrom speech. PhD thesis, Technischen Fakultätder Universität Bielefeld, 2010.

[30] Z. Xiao, E. Dellandréa, W. Dou, and L. Chen. “Hi-erarchical Classi ication of Emotional Speech”.Technical Report RR-LIRIS-2007-006, LIRISUMR 5205 CNRS/INSA de Lyon/UniversitéClaude Bernard Lyon 1/Université LumiéreLyon 2/École Centrale de Lyon, March 2007.

[31] M. You, C. Chen, J. Bu, J. Liu, and J. Tao, “A hier-archical framework for speech emotion recog-nition”. In: Industrial Electronics, 2006 IEEEInternational Symposium on, vol. 1, 2006, pp.515–519.

[32] S. Zhang and Z. Zhao, “Feature selection ilter-ing methods for emotion recognition in chinesespeech signal”. In: Signal Processing, 2008. ICSP2008. 9th International Conference on, 2008, pp.1699–1702.

[33] J. Zhou, G. Wang, Y. Yang, and P. Chen, “Speechemotion recognition based on rough set andsvm.”. In: Y. Yao, Z. Shi, Y. Wang, and W. Kinsner,eds., IEEE ICCI, 2006, pp. 53–61.

65

jamris 2013 vol 7 no 4

Documents

university wrzburg

bengurion university

tokyo institute of technology

industrial research

ukjan jablkowski piap

mexicotadeusz missala

magorzata kaliczynska

canadaroman szewczyk