
MULTI-OBJECTIVE EVOLUTIONARY OPTIMIZATION: A FEW APPLICATIONS

Hugo Jair Escalante

Contents

• Single- and multi-objective optimization

• Multi-objective evolutionary algorithms (NSGA-II)

• Maximizing diversification of search results

• Prototype generation for classification

• Discussion

SINGLE/MULTI-OBJECTIVE OPTIMIZATION

Multi-objective Evolutionary Algorithms: two applications

http://en.wikipedia.org/wiki/Mathematical_optimization

Single-objective optimization

• A single-objective optimization problem can be defined as:

min f(x)
s.t. g_i(x) ≤ 0 for i = {1,…,I}
     h_j(x) = 0 for j = {1,…,J}
     x_k^l ≤ x_k ≤ x_k^u for k = {1,…,n}

Single-objective optimization

[Figure: Brian Birge's PSO demo for Matlab on the Rosenbrock function]

Single-objective optimization

• In this type of problem we want to find a solution x* associated with an extreme value of f. There are different kinds of methods for approaching these problems (e.g., gradient-based, simplex, heuristic, etc.)

Multi-objective optimization

• A multi-objective optimization problem can be defined as:

min f(x) = ⟨f_1(x), …, f_N(x)⟩
s.t. g_i(x) ≤ 0 for i = {1,…,I}
     h_j(x) = 0 for j = {1,…,J}
     x_k^l ≤ x_k ≤ x_k^u for k = {1,…,n}

Multi-objective optimization

[Figure: mapping from the decision space to the objective space, with axes f_1(x) and f_2(x)]

Multi-objective optimization

• In MOO we deal with problems involving more than one objective. Hence a good candidate solution must return acceptable values for all of the considered objectives

• Optimum in MOO: the solution that represents the best tradeoff among the considered objectives

Multi-objective optimization

• Pareto optimality: one of the most widely accepted notions of optimum

• (Some) MOO methods are based on the concept of dominance to determine whether one solution is better than another

Pareto dominance: solution x1 dominates x2 iff x1 is better than x2 in at least one objective and is not worse in the rest.

Multi-objective optimization

• A solution x* is a Pareto optimum iff there is no other solution x′ that dominates x*

• Problem: the output of a MOO method is not a single solution but an approximation to the Pareto optimal set

No solution is better than another within the Pareto optimal set. Selecting a single solution is the job of the decision maker.
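The dominance test above is mechanical enough to write down directly. A minimal Python sketch (minimization convention, objective vectors as tuples; the function names are ours, not from the slides):

```python
def dominates(a, b):
    """True iff solution a Pareto-dominates b (minimization):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and \
           any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors --
    an approximation to the Pareto optimal set."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, `pareto_front([(1, 3), (2, 2), (3, 1), (2, 4)])` keeps the first three points and drops (2, 4), which is dominated by (2, 2).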

MULTI-OBJECTIVE EVOLUTIONARY ALGORITHMS (NSGA-II)


Evolutionary Computing

• EC is the collective name for a range of problem-solving techniques based on principles of biological evolution, such as natural selection and genetic inheritance.

• These techniques are increasingly being applied to a wide variety of problems, ranging from practical applications in industry and commerce to leading-edge scientific research.

Evolutionary Computing

• Trial-and-error problem-solving approach:

– While not_satisfied_with_solution
  1. Generate candidate solution(s) for the problem at hand
  2. Evaluate the quality of the candidate solution(s)
– Return best_solution_found

EC techniques generate new solutions according to (rough) analogies with biological evolution principles.

NSGA-II: (perhaps) the most widely used MOEA

Non-dominated sorting

NSGA-II: (perhaps) the most widely used MOEA

Crowding distance
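NSGA-II's crowding distance (the diversity-preserving tiebreaker within a front) can be sketched compactly. This is a from-scratch illustration of the standard formula, not the reference implementation:

```python
def crowding_distance(front):
    """Crowding distance of each point in a front (list of objective tuples).
    Boundary points get infinity; each interior point accumulates, per
    objective, the normalized gap between its two neighbors along that axis."""
    n, m = len(front), len(front[0])
    dist = [0.0] * n
    for k in range(m):
        # Sort indices by the k-th objective.
        order = sorted(range(n), key=lambda i: front[i][k])
        dist[order[0]] = dist[order[-1]] = float("inf")
        span = front[order[-1]][k] - front[order[0]][k]
        if span == 0:
            continue  # all points equal on this objective
        for j in range(1, n - 1):
            dist[order[j]] += (front[order[j + 1]][k] - front[order[j - 1]][k]) / span
    return dist
```

Points with larger crowding distance live in less crowded regions and are preferred when the last front must be truncated.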

NSGA-II: (perhaps) the most widely used MOEA

NSGA-II's output

MAXIMIZING VISUAL DIVERSITY OF IMAGE RETRIEVAL RESULTS

Hugo Jair Escalante, Alicia Morales. TIA-INAOE's approach for the 2013 Retrieving Diverse Social Images task. MediaEval 2013 Workshop, October 18-19, 2013, Barcelona, Spain, CEUR Workshop Proceedings, Vol. 1043, 2013.

Diversification of retrieval results in content-based image retrieval

• Given a list of images (relevant to a query), re-rank the list such that the visual diversity of the top-ranked images is maximized

[Figure: a retrieval model matches a query against an image collection; example result: Machu Picchu in the background]


• The 2013 Retrieving Diverse Social Images Task: result diversification in social photo retrieval. Organizers:

– Provide data
  • Ranked lists of documents
  • Textual features, visual features, tags, comments, etc.
– Evaluation
  • Evaluate participants

http://www.multimediaeval.org/mediaeval2013/diverseimages2013/

• Considered scenario:
– A user searches for images of a specific location in social media (e.g., Flickr)
– Text is used for searching
– The user wants the images in the first positions of the list to be visually diverse from each other
– Additionally, all of the images must be relevant:
  • About the searched location (GPS coordinates)
  • No person in the image
  • …

[Example: Casas Grandes, Chihuahua, Mexico]

Multi-objective optimization for result diversification

• Idea: re-rank the list of images such that a tradeoff between relevance and diversity is maximized

Multi-objective optimization for result diversification

• NSGA-II is used to approach the problem as follows:

Maximize ⟨ ρ(S0, S), β(S) ⟩

• Where ρ(S0, S) is the relevance term and β(S) is the diversity term
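The slides do not spell out ρ and β, so the following Python sketch substitutes one plausible pair of definitions: relevance as closeness of the candidate scores to the original scores S0, and diversity as the average pairwise distance among the top-k images. Both formulas are our assumptions, not the paper's:

```python
def relevance(s0, s):
    """rho(S0, S) stand-in: negative mean squared deviation of the new
    scores from the original ones (higher is better). ASSUMED definition."""
    return -sum((a - b) ** 2 for a, b in zip(s0, s)) / len(s0)

def diversity(features, s, k=3):
    """beta(S) stand-in: average pairwise Euclidean distance among the
    visual features of the top-k images induced by score vector s.
    ASSUMED definition; k is a hypothetical cutoff."""
    top = sorted(range(len(s)), key=lambda i: -s[i])[:k]
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    pairs = [(i, j) for i in top for j in top if i < j]
    return sum(dist(features[i], features[j]) for i, j in pairs) / len(pairs)
```

Any pair of functions with these signatures would slot into the NSGA-II objective vector above.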

MORD: Representation

• Each solution is the vector of scores used to generate the ranked list

• A solution to our problem is a ranked list of images

MORD: Representation

Rank | S0 score
1    | 1
2    | 0.5
3    | 0.3
4    | 0.25
5    | 0.2
6    | 0.16
…    | …

Initial population


Multi-objective optimization for result diversification

• Diversity criterion:

[Figure: illustration of the diversity criterion over the top-ranked images]

MORD: Evolutionary stuff

• Initialization: solutions are generated by adding random numbers to the original scores vector

• Evolutionary operators: standard crossover and mutation operators were used

MORD: Selection of a single solution

• We take the solution offering the best tradeoff between both objectives
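"Best tradeoff" is not pinned down on the slide; one common reading, sketched here under that assumption, is the front point closest to the ideal point after min-max normalizing each (maximized) objective:

```python
def best_tradeoff(front):
    """Pick one solution from a maximization Pareto front: the point
    closest to the normalized ideal point. One plausible tradeoff rule,
    not necessarily the one used in MORD."""
    m = len(front[0])
    lo = [min(p[k] for p in front) for k in range(m)]
    hi = [max(p[k] for p in front) for k in range(m)]
    def gap(p):
        # Squared normalized distance to the ideal point (hi on every axis).
        return sum(((hi[k] - p[k]) / ((hi[k] - lo[k]) or 1)) ** 2
                   for k in range(m))
    return min(front, key=gap)
```

On a front like [(1.0, 0.0), (0.8, 0.8), (0.0, 1.0)] this rule picks the balanced point (0.8, 0.8) rather than either extreme.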

Experiments & results

• Three runs were submitted:

1. Visual

2. Textual

3. Visual+Textual

[Figure: final Pareto fronts of the runs in the relevance vs. visual diversity plane]

Experiments & results

Initial list (7 topics in top-12 images)

Experiments & results

Re-ranked (8 topics in top-12 images)

Experiments & results

• Comparison with other participants: 6th out of 11

[Figure: per-team performance (P@10, C@10, F1@10) on the visual run; teams include SOTON-UK, LIP6-FR, BILKENT-TR, UEC-JP, and TIA-INAOE-MX]

Experiments & results

• Comparison with other participants: 5th out of 11

[Figure: per-team performance (P@10, C@10, F1@10) on the textual run]

Experiments & results

• Comparison with other participants: 6th out of 11

[Figure: per-team performance (P@10, C@10, F1@10) on the visual+textual run]

Conclusions

• The multi-objective formulation for RD is promising, but not as effective as we expected

• The initial ranked list was not too reliable?

• No feature selection / special processing of features

• Did not take advantage of metadata (tags, comments, etc.)

• Too many parameters/decisions to fix/take

Future work

• Alternative objective functions for both relevance and diversity

• Evaluation of the gains over single-objective combinatorial approaches

• Efficient implementation on GPUs

• Incorporating feature selection into the optimization process

MOPG: MULTI-OBJECTIVE PROTOTYPE GENERATION FOR CLASSIFICATION

Hugo Jair Escalante, Maribel Marin-Castro, Mario Graff, Alicia Morales-Reyes, Manuel Montes, Alejandro Rosales, Jesús A. González, Carlos A. Reyes. MOPG: Multi-objective prototype generation for classification. Submitted to Pattern Recognition, October 12, 2013.

KNN classifier

• One of the most popular non-parametric classifiers

• Easy to implement and very effective

• Main issues with KNN:
– The curse of dimensionality
– Efficiency
– Sensitivity to noisy data

[Figure: positive and negative examples in feature space]

Prototype-based classification

• KNN classifiers using a subset of the original data

• The goal is to reduce the computational cost of standard KNN by filtering out noisy/redundant instances and keeping the most informative ones

• Key issue: how to select/obtain the set of prototypes for a classification problem?
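Once a prototype set exists, the classification step is just nearest-neighbor search against the reduced set. A minimal sketch (k = 1, squared Euclidean distance; helper names are ours):

```python
def nn_classify(x, prototypes, labels):
    """Predict the label of the prototype nearest to x -- 1-NN over the
    reduced prototype set instead of the full training data, which is
    where the computational savings come from."""
    sqdist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
    nearest = min(range(len(prototypes)), key=lambda i: sqdist(x, prototypes[i]))
    return labels[nearest]
```

With P prototypes instead of N training instances, each query costs O(P·d) rather than O(N·d), so the quality of the prototype set is the whole game.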

Prototype generation

• Problem: to select a (small) subset of instances such that the classification performance of a particular classifier (KNN) is not degraded significantly

[Figure: positive and negative examples with selected prototypes]

Accuracy vs reduction dilemma

• The two key aspects for the evaluation of PG methods are reduction and accuracy on unseen data

• Maximizing reduction may cause accuracy to decrease, and vice versa

[Figure: accuracy vs. reduction scatter of existing PG methods (PNN, BTS3, MCA, GMCA, ICPL, LVQ3, RSP3, ENPC, PSO, GPPC, GENN, Depur, HYB, POC, 1NN, among others)]

MOPG: Multi-Objective Prototype Generation

• Idea: approaching the PG problem as one of multi-objective optimization, where the objectives are reduction and accuracy

• Goal: to obtain solutions that offer a good tradeoff between both objectives, and then select one for classification

MOPG: Multi-Objective

Prototype Generation

• NSGA-II is used to approach the following problem:

Maximize ⟨ f1(P), f2(P) ⟩

• Where f1(P) = δ(P, D) is the training-set reduction and f2(P) = γ(P, D) is the hold-out classification performance
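Under one straightforward reading of δ and γ (the slides leave the exact formulas to the paper), the two objectives could be computed as below; the 1-NN evaluator and the data layout are our assumptions:

```python
def reduction(prototypes, train):
    """f1 stand-in: fraction of the training set removed by keeping
    only the prototypes."""
    return 1 - len(prototypes) / len(train)

def holdout_accuracy(prototypes, labels, validation):
    """f2 stand-in: 1-NN accuracy on a held-out validation set of
    (x, y) pairs, using only the prototypes as reference points."""
    def predict(x):
        d = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b))
        return labels[min(range(len(prototypes)), key=lambda i: d(x, prototypes[i]))]
    return sum(predict(x) == y for x, y in validation) / len(validation)
```

NSGA-II then evolves prototype sets scoring each candidate on this two-element objective vector.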

MOPG: Representation

• Each solution is codified as a matrix of size P × d (instances × features)

• A solution to our problem is a set of instances (the prototypes)

MOPG: Initialization

• Training data is divided into development and validation partitions

• Development: instances from which prototypes can be generated

• Validation: hold-out data set used to evaluate solutions

• The partition is updated every iteration

• Initialization: for each class we randomly select a set of training instances (the class distribution is maintained)

MOPG: Evolutionary operators

• Crossover: with uniform probability, either
  • Interchange (same-class) prototypes between solutions
  • Replace a prototype of class k in one solution with the average of all prototypes from class k in the other solution

• Mutation: with uniform probability, either
  • Add a vector of random numbers to a prototype
  • Replace a prototype with another instance from the development set
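The mutation operator above might look like this in Python; the mutation rate and noise scale are hypothetical settings (the paper fixes the actual values):

```python
import random

def mutate(solution, development, rate=0.1, scale=0.05, rng=None):
    """With probability `rate` per prototype, either add a vector of small
    random numbers to it, or replace it with an instance drawn from the
    development set -- the two mutations described above. `rate` and
    `scale` are assumed parameters."""
    rng = rng or random.Random(0)
    out = []
    for proto in solution:
        if rng.random() < rate:
            if rng.random() < 0.5:
                # Perturb the prototype with small uniform noise.
                out.append(tuple(v + rng.uniform(-scale, scale) for v in proto))
            else:
                # Replace it with a random development-set instance.
                out.append(rng.choice(development))
        else:
            out.append(proto)
    return out
```

Crossover would act analogously on pairs of solutions, swapping or averaging same-class rows of the P × d matrices.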

MOPG: Selection of a single solution

• We evaluate the performance of each solution in the Pareto front and choose the one with the highest accuracy

[Figure: Pareto fronts (accuracy f2(P) vs. reduction f1(P)) for two sample data sets, ring and banana]

Experiments and results

• We performed experiments over 59 classification problems of diverse characteristics

• Compared the performance of our proposal to that of 25 alternative prototype generation techniques

I. Triguero, J. Derrac, S. García and F. Herrera. A Taxonomy and Experimental Study on Prototype Generation for Nearest Neighbor Classification. IEEE Trans. on Systems, Man, and Cybernetics, Part C, 42(1):86-100, 2012.


Experiments & results

• Evaluation of the selection strategy:

[Figure: accuracy f2(P) vs. reduction f1(P) for the ring and banana data sets]

Experiments & results

• Parameter settings

[Figure: accuracy (%) and reduction (%) as a function of crossover and mutation rates (0.1-0.9): crossover/accuracy, mutation/accuracy, crossover/reduction, mutation/reduction]

Experiments & results

• Comparison with related work

[Figure: accuracy vs. reduction of MOPG and reference PG methods on small data sets]

• Comparison with related work

[Figure: accuracy of MOPG and reference PG methods on small and large data sets]

Experiments & results

[Figure: accuracy vs. reduction scatter of reference PG methods on small and large data sets]

Experiments & results

[Figure: reduction-accuracy tradeoff (reduction × accuracy) per PG method; small and large data sets]

Experiments & results

[Figure: accuracy and reduction-accuracy tradeoff (reduction × accuracy) per PG method on large data sets]

Experiments & results

• Comparison with the best* methods (so far) for PG

Conclusions

• The multi-objective formulation for PG is a promising alternative to mono-objective approaches

• We hope our work can foster the development of other multi-objective optimization methods for PG

• We showed evidence supporting the hypothesis that our proposal, MOPG, is very competitive in terms of both objectives, reduction and accuracy

• MOPG outperforms most PG methods proposed so far

Future work

• Devising better ways to select the best solution from the Pareto front

• Efficient implementation of MOPG to deal with big-data problems (GPUs)

• Adapt MOPG for the generation of visual vocabularies

SIMULTANEOUS GENERATION OF PROTOTYPES AND FEATURES

M. García-Limón, H. J. Escalante, E. Morales, A. Morales. Simultaneous Generation of Prototypes and Features through Genetic Programming. GECCO '14 Proceedings of the 2014 Conference on Genetic and Evolutionary Computation, pp. 517-524 (full paper, oral presentation), Vancouver, Canada, July 12-17, 2014.

M. Alfonso García, H. J. Escalante, E. Morales. Towards Simultaneous Prototype and Feature Generation. Proc. of the XVI IEEE Autumn Meeting of Power, Electronics and Computer Science ROPEC 2014 Internacional, pp. 393-398, 2014.

Best MS Thesis on Artificial Intelligence 2015 (SMIA)

Simultaneous generation of features and prototypes

• Is it possible to apply the same approach to generate features?

• Is it possible to perform both feature and prototype generation simultaneously?

• Would a multi-objective formulation further help?

Simultaneous generation of features and prototypes

• We aim to find a set of prototypes and features such that:
– Accuracy is maximized
– The number of instances is reduced
– The number of features is kept low

• Proposed solution: multi-objective GP
– Same idea: combine instances/features to generate prototypes/features
– Multi-objective implementation (NSGA-II)


Simultaneous generation of features and prototypes

• A different feature space for each class


Simultaneous generation of features and prototypes

• We select a solution by looking at accuracy only

Simultaneous generation of features and prototypes

• Example:
– Original data set (initial instances and input space)

Simultaneous generation of features and prototypes

• Example:
– Prototypes and input space for class 1

Simultaneous generation of features and prototypes

• Example:
– Prototypes and input space for class 2

Simultaneous generation of features and prototypes

• Some results:

[Figures: results on small and large data sets]

Simultaneous generation of features and prototypes

Simultaneous generation of features and prototypes

• Competitive performance on generation of both prototypes and features

• Class-specific input spaces

• Other uses: oversampling, data embedding, visualization, …

• Issue: not scalable to large data sets

Questions?
