An Evolutionary Algorithm for Query Optimization in Database

Kayvan Asghari, Ali Safari Mamaghani, Mohammad Reza Meybodi. International Joint Conferences on Computer, Information, and Systems Sciences, and Engineering, CISSE 2007.


TRANSCRIPT

An Evolutionary Algorithm for Query Optimization in Database

Kayvan Asghari, Ali Safari Mamaghani, Mohammad Reza Meybodi

International Joint Conferences on Computer, Information, and Systems Sciences, and Engineering

CISSE 2007

Introduction

Optimizing database queries is a hard research problem. When a query involves more than five or six relations, exhaustive search techniques become too expensive in both memory and time.

Queries with a large number of relations arise in:

- Deductive database management systems
- Expert systems
- Engineering database management systems (CAD/CAM)
- Decision support systems
- Data mining
- Scientific database management systems

Whatever the reason, database management systems need low-cost query optimization techniques to cope with such complicated queries.

Searching Algorithms for Suitable Join Order

Exact algorithms search the whole state space, sometimes reducing it with heuristic methods:

- Dynamic programming, first introduced by Selinger et al. for optimizing join ordering in System R
- Minimum selectivity algorithm
- KBZ algorithm
- AB algorithm

Exhaustive search techniques such as dynamic programming are suitable only for queries with a few relations; for larger queries we have to use randomized and evolutionary methods.

Randomized algorithms were introduced to overcome the inability of exact algorithms to handle large queries:

- Iterative improvement
- Simulated annealing
- Two-phase optimization
- Toured simulated annealing
- Random sampling
- Evolutionary algorithms
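To make the dynamic programming approach concrete, here is a minimal sketch of a subset-based search restricted to left-deep plans. It is not the System R implementation: the function name `best_left_deep_order`, the statistics passed in, and the cost model (sum of estimated intermediate result sizes) are all assumptions chosen for illustration.

```python
from itertools import combinations


def best_left_deep_order(cards, selectivity):
    """Dynamic programming over subsets of relations (left-deep plans only).

    cards: dict mapping relation name -> cardinality (hypothetical statistics).
    selectivity: dict mapping frozenset({a, b}) -> join selectivity; pairs
    that are missing are treated as cross products (selectivity 1.0).
    Cost model (an assumption): sum of estimated intermediate result sizes.
    """
    rels = sorted(cards)

    def size(subset):
        # Estimated cardinality of joining every relation in `subset`.
        result = 1.0
        for r in sorted(subset):
            result *= cards[r]
        for a, b in combinations(sorted(subset), 2):
            result *= selectivity.get(frozenset((a, b)), 1.0)
        return result

    # best[subset] = (cost, order) of the cheapest left-deep plan for subset.
    best = {frozenset([r]): (0.0, (r,)) for r in rels}
    for k in range(2, len(rels) + 1):
        for combo in combinations(rels, k):
            subset = frozenset(combo)
            out = size(subset)
            candidates = []
            for last in combo:
                # Cheapest plan for the rest, then join `last` at the end.
                cost, order = best[subset - {last}]
                candidates.append((cost + out, order + (last,)))
            best[subset] = min(candidates)
    return best[frozenset(rels)]
```

The table `best` holds one entry per non-empty subset of relations, so memory and time grow as 2^n. This is why such exact methods stop being practical beyond a handful of relations.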

Evolutionary algorithms

- A genetic algorithm for this problem was presented by Bennett et al.
- Steinbrunn et al. carried out further work using different coding methods and genetic operators.
- Genetic programming was introduced by Stillger et al.
- The CGO genetic optimizer was introduced by Mulero et al.

In our paper, a hybrid evolutionary algorithm is proposed that uses a genetic algorithm and learning automata together to search the state space of the problem.

The Definition of Problem

The DBMS selects the best query execution plan (qep) from among the candidate execution plans, so that executing the query incurs a low cost, especially in input/output operations and processing time.

Let S denote the set of all execution plans that answer the query. Each member qep of S has a cost, cost(qep). The purpose of an optimization algorithm is to find a member qep_0 of S such that:

$$\mathrm{cost}(qep_0) = \min_{qep \in S} \mathrm{cost}(qep)$$

Processing and optimizing the join operators in a query is difficult.

The join operator takes two relations as input, combines their tuples pairwise according to a given criterion, and produces a new relation as output.

The join operator is associative and commutative, so the number of execution plans for a query grows exponentially as the number of joins among relations increases.
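A short calculation shows how quickly the plan space grows. The sketch below counts join orders under two standard counting arguments, assuming cross products are allowed: n! left-deep orders, and (2(n-1))!/(n-1)! bushy trees when the left and right inputs of each join are distinguished. The function names are illustrative only.

```python
from math import factorial


def left_deep_orders(n):
    """Number of left-deep join orders for n relations: n!."""
    return factorial(n)


def bushy_trees(n):
    """Number of ordered binary join trees over n labelled relations.

    Equals n! * Catalan(n - 1) = (2(n - 1))! / (n - 1)!.
    """
    return factorial(2 * (n - 1)) // factorial(n - 1)


for n in (2, 5, 10):
    print(n, left_deep_orders(n), bushy_trees(n))
```

With 10 relations there are already 3,628,800 left-deep orders, which illustrates why exhaustive enumeration is limited to small queries.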

Learning automata

The learning automata approach to learning involves determining an optimal action from a set of allowable actions.

At each step, the automaton selects an action from its finite set of actions.
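The slide does not give an update rule, so as one concrete instance the sketch below implements the textbook two-action Tsetlin automaton with memory depth N. The class name, the `(action, level)` encoding, and the binary reward signal are assumptions made for illustration and are not taken from the paper.

```python
class TsetlinAutomaton:
    """Two-action Tsetlin automaton with memory depth `depth` (2 * depth states).

    The state is stored as (action, level): level 1 is the most internal
    (most certain) state of an action, level `depth` is its boundary state.
    """

    def __init__(self, depth):
        self.depth = depth
        self.action = 0
        self.level = depth  # start at the boundary: no confidence yet

    def update(self, rewarded):
        if rewarded:
            # Reward: move one step toward the internal state.
            self.level = max(1, self.level - 1)
        elif self.level < self.depth:
            # Penalty away from the boundary: lose one step of certainty.
            self.level += 1
        else:
            # Penalty at the boundary: switch to the other action.
            self.action = 1 - self.action


# Toy environment (hypothetical): action 1 is always rewarded, action 0 never.
automaton = TsetlinAutomaton(depth=3)
for _ in range(10):
    automaton.update(rewarded=(automaton.action == 1))
```

After a few interactions with this toy environment the automaton settles on action 1 and sits in its most internal state.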

Proposed Hybrid Algorithm for Solving Join Ordering Problem

- The algorithm combines genetic algorithms and learning automata.
- Generation, penalty, and reward are among the features of the hybrid algorithm.
- Unlike the classical genetic algorithm, the proposed algorithm does not use binary coding or natural permutation representations for chromosomes.
- Each chromosome is represented by a learning automaton of the object migration kind.
- Each gene in a chromosome is assigned to one of the automaton's actions and is placed at a definite depth of that action.

Learning automata of object migration kind

In these automata, $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_k\}$ is the set of allowed actions of the automaton.

$\Phi = \{\varphi_1, \varphi_2, \ldots, \varphi_{kN}\}$ is the set of states, and N is the memory depth of the automaton.

Now consider the following query: (A ⋈ C) and (B ⋈ C) and (C ⋈ D) and (D ⋈ E)

[Figure: an example query graph over the relations A, B, C, D, and E, with join edges P1, P2, P3, and P4]

Learning automata of object migration kind

[Figure: representation of the join permutation (p3, p2, p1, p4) by a learning automaton with Tsetlin automata connections]
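One way to picture this representation in code is to store, for every join, the action it is attached to and its depth within that action. The sketch below is an illustration under stated assumptions, not the authors' data structure: the names `Gene`, `state_number`, and `chromosome_from_permutation` are hypothetical, action j is assumed to own states (j-1)N+1 through jN, depth 1 is assumed to be the most internal state, and new genes are assumed to start at the boundary state.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    join: str    # join identifier, e.g. "p3"
    action: int  # 1-based action index = position of the join in the order
    depth: int   # 1 = most internal state, N = boundary state (assumed)


def state_number(gene, memory_depth):
    """Global state index, assuming action j owns states (j-1)N+1 .. jN."""
    return (gene.action - 1) * memory_depth + gene.depth


def chromosome_from_permutation(perm, memory_depth):
    """Attach each join to the action matching its position in `perm`.

    Starting every gene at the boundary state is an assumption.
    """
    return [Gene(join, i, memory_depth) for i, join in enumerate(perm, start=1)]


def permutation_of(chromosome):
    """Read the join order back from the genes' actions."""
    return tuple(g.join for g in sorted(chromosome, key=lambda g: g.action))


chromosome = chromosome_from_permutation(("p3", "p2", "p1", "p4"), memory_depth=5)
```

With memory depth 5, the join p2 sits on the second action, so its state number falls in the range 6 to 10.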

Fitness function

The purpose of searching for the optimal order of query joins is to find a permutation of the join operators that minimizes the total cost of query execution.

An important factor in computing the fitness function is the number of references to the disk, so we can define the fitness function F for an execution plan qep as follows:

$$F(qep) = \frac{1}{\text{number of references to disk}}$$

$$C_{total} = C(R_1) + C(R_2) + C(R_1 \bowtie R_2)$$
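The recursive cost sum and the reciprocal fitness can be sketched as follows. The plan encoding (nested 2-tuples of relation names), the zero cost for base relations, and the toy join cost model `toy_join_cost` are assumptions introduced only for illustration. The slides do not specify how the number of disk references of a single join is estimated.

```python
def total_cost(plan, cards, join_cost):
    """Return (cost, cardinality) for a plan given as nested 2-tuples.

    A plan is either a relation name (str) or a pair (left, right).
    Base relations are assumed to contribute no cost of their own.
    """
    if isinstance(plan, str):
        return 0.0, cards[plan]
    left, right = plan
    c1, n1 = total_cost(left, cards, join_cost)
    c2, n2 = total_cost(right, cards, join_cost)
    c12, n12 = join_cost(n1, n2)
    # C_total = C(R1) + C(R2) + C(R1 join R2)
    return c1 + c2 + c12, n12


def fitness(plan, cards, join_cost):
    """F(qep) = 1 / cost; defined only for plans with at least one join."""
    cost, _ = total_cost(plan, cards, join_cost)
    return 1.0 / cost


def toy_join_cost(n1, n2, selectivity=0.1):
    """Hypothetical model: cost n1 * n2, output size n1 * n2 * selectivity."""
    return n1 * n2, n1 * n2 * selectivity


cards = {"A": 10, "B": 20, "C": 30}
plan_a = (("A", "B"), "C")
plan_b = ("A", ("B", "C"))
```

Under this toy model `plan_a` is cheaper than `plan_b`, so it receives the higher fitness.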

Operators of Hybrid genetic algorithm

Selection operator: The algorithm uses roulette wheel selection.

Crossover operator: Two parent chromosomes are selected, and two genes i and j are selected randomly in one of the two parent chromosomes.

Mutation operator: Any method suitable for working with permutations can be used. For example, in swap mutation, two actions (genes) of one automaton (chromosome) are selected randomly and exchanged with each other.
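Roulette wheel selection and swap mutation are standard operators, and a minimal sketch of both is given here. The function names and the plain-list encoding of a chromosome are assumptions. Crossover is left out because the slide text describes it only partially.

```python
import random


def roulette_wheel_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off


def swap_mutation(order, rng=random):
    """Return a copy of `order` with two randomly chosen positions exchanged."""
    i, j = rng.sample(range(len(order)), 2)
    mutated = list(order)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


rng = random.Random(0)  # seeded only to make the example repeatable
parent = roulette_wheel_select(
    [["p1", "p2", "p3", "p4"], ["p3", "p2", "p1", "p4"]], [0.2, 0.8], rng
)
child = swap_mutation(parent, rng)
```

Swap mutation always yields a valid permutation, which is why it suits join orders. Every join appears exactly once before and after the operator.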

Crossover Operator

Mutation operator

Penalty and Reward Operator

In each chromosome, a gene is selected randomly and its fitness is evaluated; depending on the result, a penalty or a reward is given to that gene.

As a result of the penalty or reward, the depth of the gene changes.

For example, in an automaton with Tsetlin-like connections, if join p2 is in the state set {6, 7, 8, 9, 10} and the cost of join p2 in the second action is less than the average join cost of the chromosome, a reward is given to this join: its certainty increases and it moves toward the internal states of that action.
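The state movement in this example can be sketched as below. Several details are assumptions rather than statements from the slides: action j is taken to own states (j-1)N+1 through jN, the lowest state of each action is taken to be the most internal and the highest the boundary, and any join whose cost is not below the average is penalized. What happens when a join already in a boundary state is penalized is not spelled out in the text, so the sketch only reports that case to the caller.

```python
def reward(state, memory_depth):
    """Move one step toward the internal state of the current action."""
    internal = ((state - 1) // memory_depth) * memory_depth + 1
    return max(internal, state - 1)


def penalize(state, memory_depth):
    """Move one step toward the boundary state of the current action.

    Returns (new_state, at_boundary). When the join is already in the
    boundary state it is left in place and at_boundary is True, so the
    caller can decide how to handle that case.
    """
    boundary = ((state - 1) // memory_depth + 1) * memory_depth
    if state < boundary:
        return state + 1, False
    return state, True


def apply_feedback(state, join_cost, average_cost, memory_depth):
    """Reward a join that is cheaper than average, otherwise penalize it."""
    if join_cost < average_cost:
        return reward(state, memory_depth), False
    return penalize(state, memory_depth)
```

For a join in state 8 with memory depth 5, a reward moves it to state 7 and a penalty moves it to state 9. A join already in state 6 stays there when rewarded.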

An Example of Reward Operator

The manner of giving a penalty to a join that is in a state other than the boundary state

The manner of giving a penalty to a join that is in a boundary state

Experiment Results

[Figure: cost of query execution (0 to 70000) versus number of join operators (10 to 100) for Automata, GA, and GALA. Comparison of the averaged cost obtained from the hybrid algorithm, the learning automata based on Tsetlin, and the genetic algorithm.]

[Figure: cost of query execution (0 to 70000) versus number of join operators (10 to 100) for Automata, GA, and GALA. Comparison of the averaged cost obtained from the hybrid algorithm, the learning automata based on Krinsky, and the genetic algorithm.]

[Figure: cost of query execution (0 to 70000) versus number of join operators (10 to 100) for Automata, GA, and GALA. Comparison of the averaged cost obtained from the hybrid algorithm, the learning automata based on Krylov, and the genetic algorithm.]

[Figure: cost of query execution (0 to 70000) versus number of join operators (10 to 100) for Automata, GA, and GALA. Comparison of the averaged cost obtained from the hybrid algorithm, the learning automata based on Oommen, and the genetic algorithm.]

[Figure: cost of query execution (0 to 70000) versus number of join operators (10 to 100) for GALA - Krinsky, GALA - Krylov, GALA - Oommen, and GALA - Tsetlin. Comparison of the averaged cost obtained from hybrid algorithms based on different automata.]

[Figure: cost of query execution (23000 to 43000) versus depth of automata (1 to 11) for GALA - Krinsky, GALA - Krylov, GALA - Oommen, and GALA - Tsetlin. Comparison of the averaged cost obtained from hybrid algorithms based on different automata depths.]