genetic algorithms jelena mirković, aleksandra popović, dražen drašković, veljko milutinović...

53
Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko Bajec Faculty of Computer & Information Science University of Ljubljana

Upload: clarissa-sharp

Post on 24-Dec-2015

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

Genetic Algorithms

Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović

School of Electrical Engineering,University of Belgrade

Marko BajecFaculty of Computer & Information Science

University of Ljubljana

Page 2: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

2 / 53

What You Will Learn From This Tutorial?

What is a genetic algorithm? Principles of genetic algorithms. How to design an algorithm? Comparison of gas and conventional algorithms.

Applications of GA– GA and the internet – GA and image segmentation– GA and system design

Genetic programming

Part II

Part I

Part III

Page 3: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

Part I: GA Theory

What are genetic algorithms?

How to design a genetic algorithm?

Page 4: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

4 / 53

Genetic Algorithm Is Not...

...Gene coding

Page 5: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

5 / 53

Genetic Algorithm Is...

… Computer algorithm

That resides on principles of genetics and evolution

Page 6: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

6 / 53

Instead of Introduction...

Hill climbing

local

global

Page 7: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

7 / 53

Instead of Introduction…(2)

Multi-climbers

Page 8: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

8 / 53

Instead of Introduction…(3) Genetic algorithm

I am not at the top.My high is better!

I am at the top

Height is ...

I will continue

Page 9: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

9 / 53

Instead of Introduction…(3)

Genetic algorithm - few microseconds after

Page 10: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

10 / 53

The GA Concept

Genetic algorithm (GA) introduces the principle of evolution and genetics into search among possible solutions to a given problem.

The idea is to simulate the process in natural systems. This is done by the creation within a machine

of a population of individuals represented by chromosomes, in essence a set of character strings,that are analogous to the DNA,that we have in our own chromosomes.

Page 11: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

11 / 53

Survival of the Fittest

The main principle of evolution used in GA is “survival of the fittest”.

The good solution survive, while bad ones die.

Page 12: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

12 / 53

Nature and GA...

Nature reality Genetic algorithm

Chromosome String

Gene Character

Locus String position

Genotype Population

Phenotype Decoded structure

Page 13: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

13 / 53

The History of GA

Cellular automata – John Holland, university of Michigan, 1975.

Until the early 80s, the concept was studied theoretically. In 80s, the first “real world” GAs were designed.

Page 14: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

14 / 53

Algorithmic Phases

Initialize the population

Select individuals for the mating pool

Perform crossover

Insert offspring into the population

The End

Perform mutation

yes

no

Stop?

Page 15: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

Designing GA...

How to represent genomes? How to define the crossover operator? How to define the mutation operator? How to define fitness function? How to generate next generation? How to define stopping criteria?

Page 16: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

16 / 53

Representing Genomes...

Representation Example

string 1 0 1 1 1 0 0 1

array of strings http avala yubc net ~apopovic

tree - genetic programming>

bxor

or

c

ba

Page 17: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

17 / 53

Crossover

Crossover is concept from genetics. Crossover is sexual reproduction. Crossover combines genetic material from two parents,

in order to produce superior offspring. Few types of crossover:

– One-point– Multiple point.

Page 18: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

One-point Crossover

Parent #1Parent #2

0

1

5

3

5

4

7

6

7

6

2

4

2

3

0

1

Page 19: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

One-point Crossover

Parent #1Parent #2

0

1

5

3

5

4

7

6

7

6

2

4

2

3

0

1

Page 20: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

20 / 53

Mutation

Mutation introduces randomness into the population. Mutation is asexual reproduction. The idea of mutation

is to reintroduce divergence into a converging population.

Mutation is performed on small part of population,in order to avoid entering unstable state.

Page 21: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

21 / 53

Mutation...

1 1 0 1 0 10 0

0 1 0 1 0 10 1

1 0

0 1

Parent

Child

Page 22: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

22 / 53

About Probabilities...

Average probability for individual to crossoveris, in most cases, about 80%.

Average probability for individual to mutate is about 1-2%.

Probability of genetic operators follow the probability in natural systems.

The better solutions reproduce more often.

Page 23: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

23 / 53

Fitness Function

Fitness function is evaluation function,that determines what solutions are better than others.

Fitness is computed for each individual. Fitness function is application depended.

Page 24: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

24 / 53

Selection

The selection operation copies a single individual, probabilistically selected based on fitness, into the next generation of the population.

There are few possible ways to implement selection:– “Only the strongest survive”

• Choose the individuals with the highest fitness for next generation

– “Some weak solutions survive”• Assign a probability that a particular individual

will be selected for the next generation• More diversity• Some bad solutions might have good parts!

Page 25: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

25 / 53

Selection - Survival of The Strongest

0.93 0.51 0.72 0.31 0.12 0.64

Previous generation

Next generation

0.93 0.72 0.64

Page 26: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

26 / 53

Selection - Some Weak Solutions Survive

0.93 0.51 0.72 0.31 0.12 0.64

Previous generation

Next generation

0.93 0.72 0.64 0.12

0.12

Page 27: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

Mutation and Selection...

Phenotype

D

Phenotype

D

Phenotype

D

Selection Mutation

Solution distribution

Page 28: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

28 / 53

Stopping Criteria

Final problem is to decide when to stop execution of algorithm.

There are two possible solutions to this problem: – First approach:

• Stop after production of definite number of generations

– Second approach: • Stop when the improvement in average fitness

over two generations is below a threshold

Page 29: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

29 / 53

GA vs. Ad-hoc Algorithms

Genetic Algorithm Ad-hoc Algorithms

Speed

Human work

Applicability

Performance

Slow * Generally fast

Minimal Long and exhaustive

GeneralThere are problems

that cannot be solved analytically

Excellent Depends

* Not necessary!

Page 30: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

30 / 53

Problems With GAs

Sometimes GA is extremely slow, and much slower than usual algorithms

Page 31: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

31 / 53

Advantages of GAs

Concept is easy to understand. Minimum human involvement. Computer is not learned how to use existing solution,

but to find new solution! Modular, separate from application Supports multi-objective optimization Always an answer; answer gets better with time !!! Inherently parallel; easily distributed Many ways to speed up and improve a GA-based application as

knowledge about problem domain is gained Easy to exploit previous or alternate solutions

Page 32: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

32 / 53

GA: An Example - Diophantine Equations

Diophantine equation (n=4):

A*x + b*y + c*z + d*q = s

For given a, b, c, d, and s - find x, y, z, q

Genome:

(X, y, z, p) = x y z q

Page 33: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

33 / 53

GA: An Example - Diophantine Equations(2)

Crossover

Mutation

( 1, 2, 3, 4 )

( 5, 6, 7, 8 )

( 1, 6, 3, 4 )

( 5, 2, 7, 8 )

( 1, 2, 3, 4 ) ( 1, 2, 3, 9 )

Page 34: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

34 / 53

GA: An Example - Diophantine Equations(3)

First generation is randomly generated of numbers lower than sum (s).

Fitness is defined as absolute value of difference between total and given sum:

Fitness = abs (total - sum) ,

Algorithm enters a loop in which operators are performed on genomes: crossover, mutation, selection.

After number of generation a solution is reached.

Page 35: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

Some Applications of GAs

GAInternet search

Data mining

Software guided circuit designControl systems design

Stock prize prediction

Path finding Mobile robotssearch

Optimization

Trend spotting

Page 36: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

Part II: Applications of GAs

GA and the Internet

GA and image segmentation

GA and system design

Page 37: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

Genetic Algorithm and the Internet

Page 38: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

38 / 53

Introduction

GA can be used for intelligent internet search. GA is used in cases when search space

is relatively large. GA is adoptive search. GA is a heuristic search method.

Page 39: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

Search Engines & Web Crawlers

39 / 53

Search Engine

WebCrawler

instructs IndexedDB

Collects & stores

data to

Provides data for

Generic Web

Crawler

FocusedWeb

Crawler

isimplemented

as

Get all pages;Breadth-first algorithm

Get only „best“ pages…;Best-first algorithm

Page 40: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

40 / 53

Focused Web Crawlers

FocusedWeb

CrawlerDomain-focused Search

Web Analysis

algos

WebSearchalgos

Relayson

Content-based web analysis

(e.g., Vector Space Model)

Link-based web analysis

(e.g., PageRank, HITS)

Breadth-first search(depends on a good

selection of seed pages…Problems with performance!)

Best-first search(try to predict what linkslead to quality pages…

Problems: „local search“ (LS))

Page 41: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

41 / 53

Using GA to Solve LS problem

List of search key wordsSearch

Generation 0 Seed pages…..

Apply SELECTION, CROSSOVER, MUTATION

…..Generation 1

…..Generation 2

Apply SELECTION, CROSSOVER, MUTATION

…..Generation n Result pages

CONVERGENCE

Page 42: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

Example of a GA algorithm

SELECTION– Calculate Fitness value for each page C – FV(c);– Randomly select members of the generation by taking into account the

distribution of FV – selected pages; CROSSOVER

– Extract all URLs from the selected pages;– Calculate Crossover value for each URL u C(u)

C(u) = sum(FV(p)) for all pages p that have a link to u; MUTATION

– Select random 3 words from the domain lexicon– Search over popular search engines and extract top x pages -

mutation pages;– Create next generation by combining selected and mutation pages;

CONVERGENCE– Repeat steps 1 to 3 – Until the num of pages with Fvtreshold reaches the pre-set number. 42 / 53

Page 43: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

43 / 53

Algorithm Phases

Process set of URLs given by user

Select all links from input set

Evaluate fitness function for all genomes

Perform crossover, mutation, and reproduction

Satisfactorysolution

obtained?

The End

Page 44: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

44 / 53

A System for the GA Internet Search Essence:

If “desperate,” do database mutationIf “happy,” do locality based mutation

CONTROL

PROGRAM

Agent Spider

Input set

Topic

Space

Time

Output set

Current set

Top data

Net data

Generator

Page 45: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

45 / 53

Spider

Spider is software packages, that picks up internet documents from user supplied input with depth specified by user.

Spider takes one URL, fetches all links, and documents thy contain with predefined depth.

The fetched documents are stored on local hard disk with same structure as on the original location.

Spider’s task is to produce the first generation. Spider is used during crossover and mutation.

Page 46: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

46 / 53

Agent

Agent takes as an input a set of urls, and calls spider, for every one of them, with depth 1.

Then, agent performs extraction of keywords from each document, and stores it in local hard disk.

Page 47: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

47 / 53

Generator

Generator generates a set of urls from given keywords, using some conventional search engine.

It takes as input the desired topic, calls yahoo search engine, and submits a query looking for all documents covering the specific topic.

Generator stores URL and topic of given web page in database called topdata.

Page 48: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

48 / 53

Topic

It uses topdata DB in order to insert random urls from database into current set.

Topic performs mutation.

Page 49: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

49 / 53

Space

Space takes as input the current set from the agent application and injects into it those urls from the database netdata that appeared with the greatest frequency in the output set of previous searches.

Page 50: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

50 / 53

Time

Time takes set of urls from agent and inserts ones with greatest frequency into DB netdata.

The netdata DB contains of three fields: URL, topic, and count number.

The DB is updated in each algorithm iteration.

Page 51: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

51 / 53

How Does the System Work?

CONTROL

PROGRAM

Agent Spider

Input set

Topic

Space

Time

Output set

Current set

Top data

Net data

Generator

command flow

data flow

Page 52: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

52 / 53

GA and the Internet: Conclusion

GA for internet search, on contrary to other gas,is much faster and more efficient that conventional solutions,such as standard internet search engines.

INTERNET

Page 53: Genetic Algorithms Jelena Mirković, Aleksandra Popović, Dražen Drašković, Veljko Milutinović School of Electrical Engineering, University of Belgrade Marko

Conclusion: Evolution of Future Research

53 / 53