
Distributed Evolution for Swarm Robotics

Suranga Hettiarachchi
Computer Science Department
University of Wyoming

Committee Members:
• Dr. William Spears – Computer Science (Committee Chair / Research Advisor)
• Dr. Diana Spears – Computer Science
• Dr. Thomas Bailey – Computer Science
• Dr. Richard Anderson-Sprecher – Statistics
• Dr. David Thayer – Physics and Astronomy

Outline

• Goals and Contributions
• Robot Swarms
• Physicomimetics Framework
• Offline Evolutionary Learning
• Novel Distributed Online Learning
• Obstacle Avoidance with Physical Robots
• Conclusion and Future Work

Goals

• To improve the state-of-the-art of obstacle avoidance in swarm robotics.

• To create a novel real-time learning algorithm for swarm robotics, to improve performance in changing environments.

Contributions

• Improved performance in obstacle avoidance:
  • Scales to far higher numbers of robots and obstacles than the norm
• Invented an online population-based learning algorithm:
  • Demonstrate feasibility of algorithm with obstacle avoidance, in environments that change dynamically and are three times denser than the norm, with obstructed perception
• Hardware Implementation
  • Implemented obstacle avoidance algorithm on real robots

Obstacle Avoidance

Hardware Implementation

Online Learning Algorithm


Robot Swarms

• Robot swarms can act as distributed computers, solving problems that a single robot cannot

• For many tasks, having a swarm maintain cohesiveness while avoiding obstacles and performing the task is of vital importance

• Example Task: Chemical Plume Source Tracing

Chemical Plume Source Tracing


Physicomimetics for Robot Control

•Biomimetics: Gain inspiration from biological systems and ethology.

•Physicomimetics: Gain inspiration from physical systems. Good for formations.

Physicomimetics Framework

Robots have limited sensor range, and friction for stabilization.

[Figure: virtual forces F exerted on a robot A by other robots a1 to a4 and by the environment cause a displacement d in its behavior.]

Robots are controlled via “virtual” forces from nearby robots, goals, and obstacles. F = ma control law.
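The F = ma control law can be illustrated with a minimal Python sketch of one update step for a single robot. It assumes the virtual forces are 2-D vectors that have already been computed. The function name `step`, the parameters `mass`, `dt`, `friction`, and `v_max`, and the way friction and the speed cap are applied are illustrative assumptions, not details taken from the slides.

```python
import math

def step(position, velocity, forces, mass=1.0, dt=1.0, friction=0.5, v_max=1.0):
    """Advance one robot by one time step under the summed virtual forces."""
    # Sum the virtual forces from nearby robots, goals, and obstacles.
    fx = sum(f[0] for f in forces)
    fy = sum(f[1] for f in forces)
    # F = ma: the acceleration F/m changes the velocity over the time step.
    vx = velocity[0] + (fx / mass) * dt
    vy = velocity[1] + (fy / mass) * dt
    # Assumed friction model: remove a fixed fraction of the velocity each step.
    vx *= (1.0 - friction)
    vy *= (1.0 - friction)
    # Assumed speed cap, since a real robot has a maximum velocity.
    speed = math.hypot(vx, vy)
    if speed > v_max:
        vx, vy = vx * v_max / speed, vy * v_max / speed
    new_position = (position[0] + vx * dt, position[1] + vy * dt)
    return new_position, (vx, vy)
```

With balanced forces the robot stays put, which is how a formation settles; friction damps any leftover oscillation.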

Seven robots form a hexagon

Two Classes of Force Laws

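Left, “Newtonian” force law:

$$F = \frac{G\, m_i\, m_j}{r^{p}}$$

Right, Lennard-Jones (LJ) force law:

$$F = 24\,\varepsilon \left[ \frac{2\, d\, \sigma^{12}}{r^{13}} - \frac{c\, \sigma^{6}}{r^{7}} \right]$$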

The left “Newtonian” force law is good for creating swarms in rigid formations. The right “Lennard-Jones” (LJ) force law more easily models fluid behavior, which is potentially better for maintaining cohesion while avoiding obstacles.

Left: the “classic” law. Right: novel use of the LJ force law for robot control.

What do these force laws look like?

Change in Force Magnitude With Varying Distance for Robot – Robot Interactions

Fmax = 1.0

Fmax = 4.0

Desired Robot Separation Distance = 50
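How the force magnitude varies with distance can be sketched in Python under stated assumptions. The Newtonian form is taken as F = G m_i m_j / r^p and the LJ form as F = 24ε(2dσ^12/r^13 − cσ^6/r^7), both capped at Fmax. The length-scale name `sigma`, the unit masses, the sign convention (positive means repulsive), and every numeric value in the usage lines are hypothetical choices for illustration.

```python
import math

def newtonian_force(r, G, p, f_max, m_i=1.0, m_j=1.0):
    """Magnitude of the "Newtonian" law G*m_i*m_j / r**p, capped at f_max."""
    if r <= 0.0:
        return f_max
    return min(G * m_i * m_j / r ** p, f_max)

def lj_force(r, eps, c, d, f_max, sigma):
    """Signed LJ force (positive = repulsive), clamped to [-f_max, f_max]."""
    if r <= 0.0:
        return f_max
    f = 24.0 * eps * (2.0 * d * sigma ** 12 / r ** 13 - c * sigma ** 6 / r ** 7)
    return max(-f_max, min(f_max, f))

def lj_equilibrium(sigma, c, d):
    """Distance at which the LJ force above is zero: sigma * (2d/c)^(1/6)."""
    if c <= 0.0:
        return math.inf  # no attractive term, so the force never crosses zero
    return sigma * (2.0 * d / c) ** (1.0 / 6.0)

# Usage with hypothetical parameter values.
r_eq = lj_equilibrium(sigma=50.0, c=1.0, d=1.0)
close = lj_force(40.0, eps=1.0, c=1.0, d=1.0, f_max=4.0, sigma=50.0)
far = lj_force(80.0, eps=1.0, c=1.0, d=1.0, f_max=4.0, sigma=50.0)
```

In this sketch the LJ force is strongly repulsive at short range, weakly attractive at long range, and zero at `r_eq`, which is the soft behavior the slides associate with fluid-like swarms.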


Swarm Learning (Offline)

• Typically, the interactions between the swarm robots are learned via simulation in “offline” mode.

[Diagram: initial rules are fed to offline learning, such as an Evolutionary Algorithm (EA). The learner sends rules to the swarm simulation environment and receives fitness in return, and outputs final rules that achieve the desired behavior.]

Offline Learning Approach

• An Evolutionary Algorithm (EA) is used to evolve the rules for the robots in the swarm.

• A global observer assigns fitness to the rules based on the collective behavior of the swarm in the simulation.

• Each member of the swarm uses the same rules. The swarm is a homogeneous distributed system.

• For physicomimetics, the rules consist of force law parameters.

Force Law Parameters

• Parameters of the “Newtonian” force law
  • G: “gravitational” constant of robot-robot interactions
  • P: power of the force law for robot-robot interactions
  • Fmax: maximum force of robot-robot interactions
  Similar 3-tuples for obstacle/goal-robot interactions.

• Parameters of the LJ force law
  • ε: strength of the robot-robot interactions
  • c: non-negative attractive robot-robot parameter
  • d: non-negative repulsive robot-robot parameter
  • Fmax: maximum force of robot-robot interactions
  Similar 4-tuples for obstacle/goal-robot interactions.

Newtonian rule set (r-r: robot-robot, r-o: robot-obstacle, r-g: robot-goal):
G_r-r, P_r-r, Fmax_r-r, G_r-o, P_r-o, Fmax_r-o, G_r-g, P_r-g, Fmax_r-g

LJ rule set:
ε_r-r, c_r-r, d_r-r, Fmax_r-r, ε_r-o, c_r-o, d_r-o, Fmax_r-o, ε_r-g, c_r-g, d_r-g, Fmax_r-g

Measuring Fitness

• Connectivity (Cohesion): maximum number of robots connected via a communication path.
• Reachability (Survivability): percentage of robots that reach the goal.
• Time to Goal: time taken by at least 80% of the robots to reach the goal.

[Figure labels: goal, connectivity, reachability, 4R]

High fitness corresponds to high connectivity,high reachability, and low time to goal.
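The three fitness measures can be sketched in Python as follows. This is an illustration under assumptions, not the thesis code: two robots are taken to be linked when they are within a hypothetical `comm_range` of each other, connectivity is the size of the largest linked group, and `arrival_times` holds each robot's arrival time at the goal, or `None` if it never arrived.

```python
import math
from collections import deque

def connectivity(positions, comm_range):
    """Size of the largest group of robots joined by communication paths."""
    n = len(positions)
    seen = [False] * n
    best = 0
    for start in range(n):
        if seen[start]:
            continue
        # Breadth-first search over robots within communication range.
        seen[start] = True
        queue = deque([start])
        size = 0
        while queue:
            i = queue.popleft()
            size += 1
            for j in range(n):
                if not seen[j] and math.dist(positions[i], positions[j]) <= comm_range:
                    seen[j] = True
                    queue.append(j)
        best = max(best, size)
    return best

def reachability(arrival_times):
    """Percentage of robots that reached the goal."""
    if not arrival_times:
        return 0.0
    reached = sum(t is not None for t in arrival_times)
    return 100.0 * reached / len(arrival_times)

def time_to_goal(arrival_times, fraction=0.8):
    """Time at which `fraction` of the robots have arrived, or None if never."""
    needed = math.ceil(fraction * len(arrival_times))
    times = sorted(t for t in arrival_times if t is not None)
    if needed == 0 or len(times) < needed:
        return None
    return times[needed - 1]
```

A global observer could combine these three numbers into one score for the EA; how they are weighted is not stated on the slides, so no combination is shown here.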

Summary of Results

• We compared the performance of the best “Newtonian” force law found by the EA to the best LJ force law.

• The “Newtonian” force law produces more rigid structures making it difficult to navigate through obstacles. This causes poor performance, despite high connectivity.

• Lennard-Jones is superior, because the swarm acts as a viscous fluid. Connectivity is maintained while allowing the robots to reach the goal in a timely manner.

• The Lennard-Jones force law demonstrates scalability in the number of robots and obstacles.

Connectivity of Robots

Time for 80% of the Robots to Reach the Goal

Force Law  Robots  |  Number of Obstacles
                   |    20    40    60    80   100
Newt          20   |  1160  1260  1290  1530  1920
Newt         100   |     -     -     -     -     -
LJ            20   |   470   480   490   510   520
LJ           100   |   640   650   670   680   690

A Problem

• The simulation assumes a certain environment. What happens if the environment changes when the swarm is fielded?
• We can’t go back to the simulation world.
• Can the swarm adapt “on-line” in the field?

Environment trained on.

Environment changes.

Performance degrades.

Frequently Proposed Solution

• Each robot has sufficient CPU power and memory to maintain a complete map of the environment.

• When environment changes, each robot runs an EA internally, on a simulation of the new environment.

• Robots wait until new rules are evolved.

• It is better to learn in the field, in real time.

4 days of simulation time

Example

• The maximum velocity is increased by 1.5x.
• Obstacles are tripled in size.
• High obstacle density creates cul-de-sacs and robots are left behind. Collisions also occur.
• Obstructed perception is also introduced.
• The learned offline rules are no longer sufficient.


Novel Online Learning Approach

• Borrow from evolution.

• Each robot in the swarm is an individual in a population that interacts with its neighbors.

• Each robot contains a slightly mutated copy of the best rule set found with offline learning.

• When the environment changes, some mutations perform better than others.

• Better performing robots share their knowledge with poorer performing neighbors.

• We call this “Distributed Agent Evolution with Dynamic Adaptation to Local Unexpected Scenarios” (DAEDALUS).

DAEDALUS for Obstacle Avoidance

• Each robot is initialized with randomly perturbed (via mutation) versions of the force laws learned with the offline simulation.

• Robots are penalized if they collide with obstacles and/or are left behind.

• Robots that are most successful and are moving will retain the highest worth, and share their force laws with neighboring robots that were not as successful.
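The DAEDALUS steps above can be sketched in Python. This is a minimal illustration under assumptions, not the thesis implementation: the `Robot` class, the starting worth of 100, the unit penalty, Gaussian perturbation in `mutate`, and the rule that a robot copies the parameters of its best strictly-better moving neighbor while keeping its own worth are all hypothetical choices.

```python
import random
from dataclasses import dataclass

@dataclass
class Robot:
    params: list          # force-law parameters carried by this robot
    worth: float = 100.0  # assumed starting worth
    moving: bool = True

def mutate(params, rate, scale, rng):
    """Perturb each parameter with probability `rate` (assumed Gaussian noise)."""
    return [p + rng.gauss(0.0, scale) if rng.random() < rate else p for p in params]

def init_swarm(seed_params, n, rate, scale, rng):
    """Give every robot a slightly mutated copy of the offline-learned rules."""
    return [Robot(mutate(seed_params, rate, scale, rng)) for _ in range(n)]

def penalize(robot, collided, left_behind, penalty=1.0):
    """Lower a robot's worth when it collides and/or is left behind."""
    if collided:
        robot.worth -= penalty
    if left_behind:
        robot.worth -= penalty

def share(robots, neighbors):
    """Each robot adopts the force law of its best, strictly better, moving neighbor.

    `neighbors` maps a robot index to the indices of robots it can reach.
    """
    # Snapshot first so that every exchange uses the pre-exchange state.
    snapshot = [(list(r.params), r.worth, r.moving) for r in robots]
    for i, robot in enumerate(robots):
        better = [j for j in neighbors.get(i, [])
                  if snapshot[j][2] and snapshot[j][1] > snapshot[i][1]]
        if better:
            best = max(better, key=lambda j: snapshot[j][1])
            robot.params = list(snapshot[best][0])

# Usage: a robot that collided copies the rules of its unpenalized neighbor.
rng = random.Random(0)
swarm = init_swarm([1.0, 2.0, 3.0], n=3, rate=0.05, scale=0.1, rng=rng)
penalize(swarm[0], collided=True, left_behind=False)
share(swarm, {0: [1], 1: [0, 2], 2: [1]})
```

Because sharing is purely local, no global observer is needed in the field; diversity comes from the initial mutations, and selection comes from neighbors comparing worth.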

Experimental Setup

• There are five goals to reach in a long corridor.
• Between each goal is a different obstacle course.
• Robots that are left behind (due to obstacle cul-de-sacs) do not proceed to the next goal.
• The number of robots that survive to reach the last goal is low. We want the robots to learn to do better, while in the field.

DAEDALUS Results

• DAEDALUS succeeded in dramatically reducing the number of collisions and improving survivability, despite the difficulties caused by obstructed perception.

• Our results depended on the mutation rate. Can DAEDALUS learn that also?

20 minutes of simulation time

Further DAEDALUS Results

• DAEDALUS also succeeded in learning the appropriate mutation rate for the robots. Hence, the system is striking a balance between exploration and exploitation.

Number of Robots Surviving with Different Mutation Rates

Stage   Total   1%   3%   5%   7%   9%
start     60    12   12   12   12   12
goal1     53     8   10   11   12   12
goal2     45     9    6   10    9   11
goal3     40     7    6   10    8    9
goal4     34     5    6    9    8    6
goal5     32     5    5    9    7    6

Effect of Mutation Rate on Survival

60 Robots moving towards 5 goals through 90 obstacles in between each goal

Collision Reduction

Summary of DAEDALUS

• Creating rapidly adapting robots in changing environments is challenging.

• Offline learning can yield initial “seed” rules, which must then be perturbed.

• The key is to maintain “diversity” in the rules that control the members of the swarm.

• Collective behaviors still arise from the local interactions of a diverse population of robots.


Obstacle Avoidance with Robots

• Use three Maxelbot robots
• Use 2D trilateration localization algorithm (not a part of this thesis)
• Design and develop obstacle avoidance module (OAM)
• Implement physicomimetics on a real outdoor robot

Hardware Architecture of Maxelbot

[Diagram: a MiniDRAGON for motor control executes physicomimetics; a MiniDRAGON for trilateration provides robot coordinates; the OAM performs A-to-D conversion. Also shown: RF and acoustic sensors, IR sensors, and three I2C links.]

Physicomimetics for Obstacle Avoidance

• Constant “virtual” attractive goal force in front of the leader

• “Virtual” repulsive forces from four sensors mounted on the front of the leader, if obstacles detected

• The resultant force creates a change in velocity due to F = ma

• Power supplied to the motors is changed based on the forces acting on the leader.
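One way to picture the mapping from sensor readings to motor power is the simplified Python sketch below. It is a hypothetical illustration, not the code that runs on the Maxelbot: the linear repulsion shape, the sensor order (left, left-middle, right-middle, right), the 0 to 100 power scale, and the names `repulsion` and `motor_power` are all assumptions. The 30-inch cutoff is the range mentioned in the slides, and the velocity integration implied by F = ma is omitted for brevity.

```python
SENSOR_RANGE_IN = 30.0  # obstacles farther away than this are ignored

def repulsion(distance_in, gain=1.0):
    """Assumed repulsive force: zero beyond range, growing linearly as the obstacle nears."""
    if distance_in is None or distance_in >= SENSOR_RANGE_IN:
        return 0.0
    return gain * (SENSOR_RANGE_IN - max(distance_in, 0.0)) / SENSOR_RANGE_IN

def motor_power(sensors_in, goal_force=1.0, max_power=100.0):
    """Return (left_power, right_power) from four front sensor distances in inches."""
    left, left_mid, right_mid, right = sensors_in
    push_from_left = repulsion(left) + repulsion(left_mid)
    push_from_right = repulsion(right) + repulsion(right_mid)
    # The constant goal force drives both motors forward.
    # An obstacle on the right reduces the left motor, turning the robot left;
    # an obstacle on the left reduces the right motor, turning it right.
    left_force = goal_force - push_from_right
    right_force = goal_force - push_from_left
    scale = max_power / goal_force

    def clamp(force):
        return max(0.0, min(max_power, force * scale))

    return clamp(left_force), clamp(right_force)

# Usage: an obstacle 15 inches away on the right slows only the left motor.
left_power, right_power = motor_power((100.0, 100.0, 100.0, 15.0))
```

When both middle sensors report a close obstacle, both motors lose power in this sketch, which corresponds to the stopping behavior described for an obstacle directly in front.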

Obstacle Avoidance Methodology

• Measure the performance of physicomimetics with repulsion from obstacles
• All experiments are conducted outdoors on “Prexy’s Pasture”
• Three Maxelbots: one leader and two followers
• Graphs show the correlation between raw sensor readings and motor power
• Leader uses the physicomimetics algorithm with the obstacle avoidance module
• Focus is on obstacle avoidance by the leader, not formation control

[Figure: Maxelbot Turning Left - Obstacle on the Right. x-axis: Time. y-axis: Sensor Reading and Motor Power. Series: Right-most Sensor Reading; Power to Left Motor.]

If there is an obstacle on the right, power to left motor is reduced

[Figure: Maxelbot Turning Right - Obstacle on the Left. x-axis: Time. y-axis: Sensor Reading and Motor Power. Series: Left-most Sensor Reading; Power to Right Motor.]

If there is an obstacle on the left, power to right motor is reduced

If there is an obstacle in front, power to both motors is reduced

[Figure: Maxelbot Stopping Behavior - Both Middle Sensors Detect an Obstacle. x-axis: Time. y-axis: Sensor Readings and Motor Power. Series: Ave. of the Two Middle Sensors; Ave. of the Motor Power.]

Further Analysis of Sensor Reading and Motor Power

• Scatter plots give more information

• They provide a broader picture of the data

• They show the correlation of motor power with distance to an obstacle in inches (the robots ignore obstacles more than 30 inches away)

Movie of 3 Maxelbots, Leader has OAM


[Figure: scatter plot. x-axis: Distance to obstacle on the left in inches. y-axis: Power to Right Motor. Annotations: “Left sensor sees obstacle”; “Left middle sensor also sees obstacle”.]

Contributions

• Improved performance in obstacle avoidance:
  • Applied a new force law for robot control, to improve performance
  • Provided novel objective performance metrics for obstacle avoiding swarms
  • Improved scalability of the swarm in obstacle avoidance
  • Improved performance of obstacle avoidance with obstructed perception
• Invented a real-time learning algorithm (DAEDALUS):
  • Demonstrate that a swarm can improve performance by mutating and exchanging force laws
  • Demonstrate feasibility of DAEDALUS with obstacle avoidance, in environments three times denser than the norm
  • Explore the trade-offs of mutation on homogeneous and heterogeneous swarm learning
• Hardware Implementation
  • Present a novel robot control algorithm that merges physicomimetics with obstacle avoidance.

Future Work

• Use DAEDALUS to provide practical solutions to real world problems
• Provide obstacle avoidance capability to all the robots in the formation
• Develop robots with greater data exchange capability
• Adapt the physicomimetics framework to incorporate performance feedback for specific tasks and situational awareness
• Extend the physicomimetics framework for sensing and performing tasks in a marine environment (with Harbor Branch)
• Introduce robot/human roles and interactions to distributed evolution architecture

Work Published

• Spears, W., Spears, D., Heil, R., Kerr, W., and Hettiarachchi, S. (2004). An overview of physicomimetics. Lecture Notes in Computer Science, State of the Art Series, Volume 3342. Springer.

• Hettiarachchi, S. and Spears, W. (2005). Moving swarm formations through obstacle fields. Proceedings of the 2005 International Conference on Artificial Intelligence, Volume 1, pp. 97-103. CSREA Press.

• Hettiarachchi, S., Spears, W., Green, D., and Kerr, W. (2005). Distributed agent evolution with dynamic adaptation to local unexpected scenarios. Proceedings of the 2005 Second GSFC/IEEE Workshop on Radical Agent Concepts. Springer.

• Spears, W., Zarzhitsky, D., Hettiarachchi, S., and Kerr, W. (2005). Strategies for multi-asset surveillance. IEEE International Conference on Networking, Sensing and Control, pp. 929-934. IEEE Press.

• Hettiarachchi, S. and Spears, W. (2006). DAEDALUS for agents with obstructed perception. In SMCals/06 IEEE Mountain Workshop on Adaptive and Learning Systems, pp. 195-200. IEEE Press. Best Paper Award.

• Hettiarachchi, S. (2006). Distributed online evolution for swarm robotics. In Doctoral Mentoring Program AAMAS06, T. Ishida and A. B. Hassine (Eds.), Autonomous Agents and Multi Agent Systems, pp. 17-18.

• Hettiarachchi, S., Maxim, P., and Spears, W. (2007). An architecture for adaptive swarms. In Robotics Research Trends, X. P. Guo (Ed.). Nova Publishers (Book Chapter).

Thank You

Questions?

Backup Slides

The next set of slides may be confusing out of context, because they are intended to be placed among slides 1-49.

DAEDALUS for Reducing Collisions

• Slightly mutate robot-obstacle force law interactions.

• Those robots that do not collide give their force laws to poorer performing robots.

DAEDALUS for Improving Survival

• The previous experiment did not attempt to alleviate the situation where robots are left behind.

• This is caused by the large number of cul-de-sacs produced by the high obstacle density.

• Slightly mutate robot-robot interaction, if there is a nearby moving neighbor.

• Rapidly mutate robot-goal interaction, if there are no neighbors.

Improved Survival

The two online experiments are independent of each other.

Task: Obstacle Avoidance with Obstructed Perception

Robots must organize themselves into a formation and then move toward a goal, while avoiding obstacles.

• A robot may not see another robot, due to the presence of obstacles.
• If r > minD, then robot A and robot B have their perception obstructed.

DAEDALUS Results

Results averaged over 100 independent runs

We do not train children on hard problems immediately. Instead, we train them on easier problems first. This is counter to accepted wisdom in the EA community.

DAEDALUS online learning is improving performance.

Homogeneous DAEDALUS

• All robots had the same mutation rate, which was 5%.

• The results may depend quite heavily on choosing the correct mutation rate.

• The best mutation rate may also depend on the environment, and should potentially change as the environment changes.

• We decided to explore this effect by conducting several experiments with different mutation rates.

Heterogeneous DAEDALUS

• We attempted to address the problem of choosing the correct mutation rate.
• We divided the robots into five groups of equal size.
• Each group of 12 robots was assigned a mutation rate of 1%, 3%, 5%, 7%, and 9%, respectively.

• This mimics the behavior of children that have different “comfort zones” in their rate of exploration.

Heterogeneous Results


The result at the final goal is essentially identical to the average of the five performance curves in the previous graph. Can DAEDALUS learn the proper “comfort zone” instead?

Analogy – Children Learning

• Borrowed from the analogy of a “swarm” of children learning some task.

• They share useful information as to the rules they might use, but they also share meta-information as to the level of exploration that is actually safe!

• Very bold children might encourage their more timid comrades to explore more than they would initially.

• If a very bold child has an accident, the rest of the children will become more timid.

Extended Heterogeneous DAEDALUS - Results


DAEDALUS now allows the robots to receive a neighbor’s mutation rate, in addition to the neighbor’s rules. The results are close to those achieved by the homogeneous DAEDALUS with the best mutation rate!

Why Physicomimetics?

• Capable of maintaining formations of robots

• Designed as a leader-follower algorithm

• Allows robots to move quickly, due to minimal communication

• Can use theory to set parameters

Physicomimetics for Formation Control

• The leader provides an attractive goal force for the followers

• The follower uses F = ma to compute the change in velocity that is required to follow the leader

• Power supplied to the motors is changed based on the changes in velocity

Formation Control Methodology

• Measure the quality of physicomimetics without repulsion from obstacles
• All experiments are conducted outdoors on “Prexy’s Pasture”
• Three Maxelbots: one leader and two followers
• Results averaged over 10 runs
• Leader is remotely controlled (no physicomimetics)
• Leader does not have obstacle avoidance capability
• Focus is on formation control, not obstacle avoidance

Triangular Formation

Triangular Formation Results

Linear Formation

Linear Formation Results


[Figure: scatter plot. x-axis: Distance to Obstacle (inches). y-axis: Power to Right Motor. Annotations: “Lag in stopping due to physicomimetic inertia. Helps counteract noisy sensors.”; “Lag in starting due to physicomimetic inertia. Helps counteract noisy sensors.”; “Left sensor sees obstacle”; “Left middle sensor sees obstacle”.]

[Figure: Maxelbot Turning Left - Obstacle on the Right. Scatter plot. y-axis: Power to Left Motor. Annotations: “Lag in starting due to AP inertia. Helps counteract noisy sensors.”; “Lag in stopping due to AP inertia. Helps counteract noisy sensors.”; “Right sensor sees obstacle”; “Right middle sensor sees obstacle”.]

[Figure: Maxelbot Stopping Behavior - Both Middle Sensors Detect an Obstacle. Scatter plot. y-axis: Average of Left and Right Motor Power. Annotation: “Power will be reduced if the outermost sensors see an obstacle when the inner sensors do not.”]
